1 Objectives and Relevance

The  goal  of  this  project  is  to  address  novel  quantitative  challenges  arising  in  network  vulnerability 
assessment  and  defense  measurement  in  the  face  of  cascading  failures  and  large-scale  attacks  such 
as  WMD.  More  specifically,  we  aim  to  accomplish  the  following  three  primary  goals: 

1. Critical Node Detection. We investigate network vulnerability under multiple attacks with
different levels of disruption. That is, we aim to identify the most critical subsets of elements
(nodes and/or edges) whose simultaneous removal maximizes the disruptive effect on
the network in terms of connectivity. The attacks considered in this study target both nodes
and edges simultaneously. We first study static networks, then dynamic networks modeled
as evolving networks, and finally a system consisting of two interdependent
networks. The study reveals the most critical locations that we need to protect (defense
strategies) or attack to break down adversarial networks (attack strategies).

2. Network Structural Interdependency and Vulnerability Assessment. Since understanding
the interdependency of network structures can reveal how vulnerability propagates,
we investigate network interdependencies based on the underlying topology,
focusing on the inter- and intra-dependencies between network components, and develop a
theoretical framework characterizing these interdependencies, which has not been provided in the
literature. To achieve these goals, we introduce several new models based on the observation that
most complex networks exhibit a modular property: nodes within a network module
are more densely connected to each other than to the rest of the network, a property sometimes
referred to as community structure. A network module may represent a functional group, a
component, or an entity within the network system, and the correlation among modules can model
and describe the interdependency between network components. Along this direction, we aim
to investigate the strength of community structures, how to break them, and how community
structures evolve, in order to predict the responses of network components to WMD attacks.

3. Impact Analysis of the Power-Law Degree Distribution on Network Vulnerability. As many
real-world complex networks, such as the Internet, the WWW, communication networks, and social
networks, have degrees that follow a power-law distribution, we investigate how this special
property impacts network vulnerability and its complexity. In particular, we aim
to (i) provide a theoretical framework for analyzing the vulnerability of power-law networks under
different attacks, (ii) provide near-optimal algorithms for many NP-complete problems on
power-law networks, and (iii) investigate the hardness of many optimization problems
on power-law networks.

Relevance. This project offers the first mathematical study of network vulnerability and defense
that considers realistic scenarios in which attacks are dynamic and spreading and failures
cascade across several interdependent networks within a complex system. It therefore lays a
foundation for understanding the fundamental properties that contribute to network robustness
under WMD attacks and thus advances the state of the art in modern complex network theory
and multi-stage optimization algorithms.

Some applications in the area of defense are as follows:

• Protection of moving military units, in which the dynamic network
of moving vehicles is considered. The project helps to design a more robust strategic network
in a WMD stressor environment. The proposed problems can also be
used to plan the paths of unmanned vehicles with wireless units so as to ensure
communication between them. The approach is also applicable to ad hoc networks.

• Other applications include emergency response in the event of failures in transportation
networks. The study will help plan the allocation of resources during an evacuation, re-establish
critical routes in the aftermath of a disaster, and predict the network responses.

•  The  study  also  finds  applications  in  critical  infrastructures  such  as  communication  networks, 
wireless  networks,  transportation  networks,  and  power  systems. 

The findings also have a major impact on the fields of network science and complex networks, as
many of the problems studied are posed on complex networks and social networks. In addition, the
solutions obtained from this project can be used in other fields such as vaccination and virus
contamination. The results are fundamental and contribute to the base of knowledge. For
example, the study of network structure is essential to understanding any behavior that takes
place on top of those networks, and the study of hardness is important for designing algorithms
for the problems. The project crosses several research areas, such as approximation algorithms,
optimization, and control theory, and thus has an impact on these fields as well.

2  Major  Accomplishments 

We  have  obtained  the  following  findings,  corresponding  to  the  above  three  primary  goals. 

2.1  Critical  Node  Detection 

We  have  modeled  two  main  optimization  problems  as  follows: 

Definition 1 $\beta$-Vertex (Edge) Disruptor ($\beta$-VD (ED)): Given a graph $G = (V, E)$ and $0 < \beta < 1$,
find a subset of vertices (edges) $S$ with minimum size so that the pairwise connectivity of $G[V \setminus S]$
($G = (V, E \setminus S)$) is at most $\beta \binom{n}{2}$.

Definition 2 $k$-CND ($k$-CED): Given a graph $G = (V, E)$ and a positive integer $k < |V|$, find a
subset $S \subseteq V$ ($S \subseteq E$) of at most $k$ elements so as to minimize the pairwise connectivity of
$G[V \setminus S]$ ($G = (V, E \setminus S)$).
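
As a rough illustration of the two definitions, the sketch below computes the pairwise connectivity of a graph, taken here to be the number of node pairs joined by a path, and checks whether a candidate vertex set meets the $\beta \binom{n}{2}$ bound of Definition 1. The function names and the edge-list representation are assumptions made for this example, and the graph is treated as undirected; this is not the project's algorithm.

```python
def pairwise_connectivity(nodes, edges):
    """Count unordered node pairs joined by a path (undirected graph)."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in parent and v in parent:  # skip edges touching removed nodes
            parent[find(u)] = find(v)

    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    # A component with s nodes contributes s * (s - 1) / 2 connected pairs.
    return sum(s * (s - 1) // 2 for s in sizes.values())


def is_vertex_disruptor(nodes, edges, removed, beta):
    """True if deleting `removed` leaves connectivity <= beta * C(n, 2)."""
    n = len(nodes)
    kept = [v for v in nodes if v not in removed]
    return pairwise_connectivity(kept, edges) <= beta * n * (n - 1) / 2


# Path 0-1-2-3-4: removing the middle node leaves two connected pairs.
path_nodes = [0, 1, 2, 3, 4]
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(pairwise_connectivity(path_nodes, path_edges))
print(is_vertex_disruptor(path_nodes, path_edges, {2}, beta=0.25))
```

The $k$-CND objective of Definition 2 is the same quantity, minimized over vertex sets within the budget $k$ rather than tested against a threshold.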

Relevance. We assess network vulnerability from two perspectives, attack and
defense. From an attack point of view, identifying and destroying the critical nodes
simultaneously achieves maximum destruction in terms of network fragmentation.
For example, we can apply this approach to remove (e.g., arrest) a small number of individuals
in an adversarial social network (e.g., a terrorist network) in order to maximally disrupt the
network's ability to deploy a coordinated attack. From a defensive point of view, we identify the
nodes and edges that are vital and need to be secured. In a broad sense, we try to minimize the damage
in the case of defense, or maximize the damage in the case of attack, with the available resources. The
proposed methods also accommodate dynamic and evolving networks, which are
especially important because network dynamics must be explicitly considered in military
settings.

The $\beta$-VD (ED) problem allows us to break a network to a required extent at minimum cost,
while $k$-CND tries to maximize the damage at a given cost. Furthermore, different values of
$\beta$ allow us to disrupt the network at different levels. The pairwise connectivity metric provides a
global view of the damage to the network, in contrast to the local measurements used by existing
metrics in the literature.

Summary  of  Findings 

1. Provided the best approximation algorithm, with an $O(\sqrt{\log n})$ ratio, for $\beta$-VD and $\beta$-ED.

2. Provided a new mathematical programming approach to find exact solutions for NP-complete
problems on instances of up to a thousand nodes. The methods currently available in the literature
can only solve instances of up to 100 nodes.

3. Proved the spectral bound for vulnerability assessment in large-scale networks. The result helps
us determine the minimum cost needed to attack (or protect) a given network in order to reduce the
network connectivity below some threshold.

4. Provided approximation algorithms for dynamic networks, for interdependent networks,
and in the presence of cascading failures.

5.  Provided  effective  defense  strategies  to  spreading  and  dynamic  attacks. 

6. Provided models and solutions to strengthen the network modules so that they are less sensitive
to changes, thereby improving network robustness at minimum additional cost.

2.1.1  Detailed  Results  for  Static  Networks 

First, we investigated the above two problems where the input $G$ is static. We can view $G$ as a
network snapshot at time $t$, and thus it is static. This study lays a foundation for solving the
problems in dynamic and evolving networks, as shown later.

We have shown that $k$-CND and $\beta$-ED are NP-complete, whereas $\beta$-VD is MaxSNP-hard.
Furthermore, we proved that $k$-CND remains NP-complete on unit disk graphs and power-law graphs.
We have proposed two pseudo-approximation algorithms for $\beta$-ED and $\beta$-VD with ratios of
$O(\log^{1.5} n)$ and $O(\log n \log\log n)$, respectively, where $n$ is the size of the input. Via experimental
results, we have also shown that the new model is a better way to assess network vulnerability.
More details can be found in [47, 52, 54]. The pseudo-code is given
in Algorithm $\beta$-edge Disruptor and Algorithm $\beta$-vertex Disruptor.

Exact Solutions and Lower Bound. We provided the first exact solution via mathematical
programming for the $\beta$-vertex disruptor problem, raising the size of the largest instance solved
from a few dozen nodes to several hundred. The technique is general and can be applied to several
other problems, not only to the $\beta$-vertex disruptor. This finding goes beyond what we originally
proposed. The result is published in [50].
Algorithm $\beta$-edge Disruptor

Input: Uniform directed graph $G = (V, E)$ and $0 < \beta < \beta' < 1$

Output: A $\beta'$-edge disruptor of $G$.

/* Construct the decomposition tree */

1. $c \leftarrow 1 - \ldots$

2. $T(V_T, E_T) \leftarrow (\{t_0\}, \emptyset)$, $V(t_0) \leftarrow V(G)$, $l(t_0) = 1$

3. while $\exists$ unvisited $t_i$ with $|V(t_i)| \ge 2$ do

4.     Mark $t_i$ visited, create new child nodes $t_{i1}, t_{i2}$ of $t_i$.

5.     $l(t_{i1}), l(t_{i2}) \leftarrow l(t_i) + 1$.

6.     $V_T \leftarrow V_T \cup \{t_{i1}, t_{i2}\}$

7.     $E_T \leftarrow E_T \cup \{(t_i, t_{i1}), (t_i, t_{i2})\}$

8.     Separate $G[V(t_i)]$ into two using directed $c$-balanced cut.

9.     Assign two obtained partitions to $V(t_{i1}), V(t_{i2})$

10.     $\mathrm{cost}(t_i) \leftarrow$ The cost of the balanced cut

11. end while

/* Find the minimum cost G-partitionable */

12. for $t_i \in T$ in reversed BFS order from root node $t_0$ do

13.     for $p \leftarrow 0$ to $\beta' \binom{n}{2}$ do

14.         if $\mathcal{P}(G[V(t_i)]) \le p$ then

15.             $\mathrm{cost}(t_i, p) \leftarrow 0$

16.         else

17.             $\mathrm{cost}(t_i, p) \leftarrow \min\{\mathrm{cost}(t_{i1}, p_1) + \mathrm{cost}(t_{i2}, p_2) + \mathrm{cost}(t_i) \mid p_1 + p_2 = p\}$

18. Find $F$ whose cost equals $\min\{\mathrm{cost}(t_0, p) \mid p \le \beta' \binom{n}{2}\}$

19. Return union of cuts used at $A(F)$ during tree construction
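
The dynamic program in steps 12 to 17 combines the cost tables of the two children of a tree node. A minimal sketch of that single combination step is given below, assuming each table is a Python list indexed by the connectivity budget $p$; the function name `cost_table` and the list representation are illustrative choices, not taken from the original work, and the balanced-cut routine that builds the tree is not shown.

```python
def cost_table(connectivity, left, right, cut_cost, budget):
    """Cost table of one decomposition-tree node.

    connectivity : pairwise connectivity of the subgraph at this node
    left, right  : cost tables of the two children (index = budget p)
    cut_cost     : cost of the balanced cut used to split this node
    budget       : largest connectivity budget p to tabulate
    """
    table = []
    for p in range(budget + 1):
        if connectivity <= p:
            # Already within budget: no cut is needed (steps 14-15).
            table.append(0)
        else:
            # Split the budget p between the children (step 17).
            best = min(left[p1] + right[p - p1] for p1 in range(p + 1))
            table.append(best + cut_cost)
    return table


# Toy tree for the path a-b-c-d, split as {a, b} | {c, d}, unit edge costs.
leaf = [0, 0, 0]                       # a single vertex has connectivity 0
pair = cost_table(1, leaf, leaf, 1, 2)  # two adjacent vertices
root = cost_table(6, pair, pair, 1, 2)  # the whole path, connectivity 6
print(pair, root)
```

In this toy tree, `root[p]` is the number of edges that must be cut to bring the pairwise connectivity of the path down to at most `p`, restricted to cuts that appear in the tree.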
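Algorithm $\beta'$-vertex Disruptor

Input: Directed graph $G = (V, E)$ and fixed $0 < \beta' < 1$.

Output: A $\beta'$-vertex disruptor of $G$

1. $G'(V', E') \leftarrow (\emptyset, \emptyset)$

2. $\forall v \in V$ : $V' \leftarrow V' \cup \{v^+, v^-\}$

3. $\forall v \in V$ : $E' \leftarrow E' \cup \{(v^- \to v^+)\}$, $c(v^-, v^+) \leftarrow 1$

4. $\forall (u \to v) \in E$ : $E' \leftarrow E' \cup \{(u^+ \to v^-)\}$, $c(u^+, v^-) \leftarrow \infty$

5. $\underline{\beta} \leftarrow 0$, $\overline{\beta} \leftarrow 1$

6. $D_V \leftarrow V(G)$

7. while $(\overline{\beta} - \underline{\beta} > \epsilon)$ do

8.     $\tilde{\beta} \leftarrow$ the midpoint of $[\underline{\beta}, \overline{\beta}]$, rounded to a multiple of $\epsilon$

9.     Find $D_e \subseteq E'$ to separate $G'$ into strongly connected components of sizes at most $\tilde{\beta}|V'|$

10.     $D_v \leftarrow \{v \in V(G) \mid (v^- \to v^+) \in D_e\}$

11.     if $\mathcal{P}(G[V \setminus D_v]) \le \beta' \binom{n}{2}$ then

12.         $\underline{\beta} = \tilde{\beta}$

13.         Remove nodes from $D_v$ as long as $\mathcal{P}(G[V \setminus D_v]) \le \beta' \binom{n}{2}$

14.         if $|D_V| > |D_v|$ then $D_V = D_v$

15.     else

16.         $\overline{\beta} = \tilde{\beta}$

18. end while

19. Return $D_V$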
We have also proven a spectral bound for link vulnerability assessment in large-scale networks.
The result helps us determine the minimum cost needed to attack (or protect) a given
network in order to reduce the network connectivity below some threshold. Two major findings
along this direction are: (1) We introduced a new spectral lower bound for the $\beta$-edge disruptor
problem in the form of an eigenvalue optimization problem, which also enriches the literature on
lower-bound techniques. (2) We presented two efficient methods to compute the proposed lower
bound: (a) the Lagrange multiplier method and (b) a dynamic programming algorithm. Moreover,
the Lagrange multiplier method can derive the lower bound from only a small number of the smallest
eigenvalues. This is important for large networks, where computing the whole network spectrum is
both time and memory consuming. The result has been published in [9, 29].

Generalization.  We  next  generalized  the  problems  to  the  case  that  both  edges  and  vertices  are 
being  attacked  at  the  same  time  as  in  the  following  model. 

$\beta$-disruptor. A $\beta$-disruptor is a pair of subsets

$$D_\beta = (V_\beta \subseteq V,\; E_\beta \subseteq E)$$

whose removal from $G$ makes the pairwise connectivity in the residual graph
$G' = (V \setminus V_\beta,\; E \setminus (E_\beta \cup V_\beta \times V_\beta))$ at most $\beta \binom{n}{2}$.
The $\beta$-disruptor problem asks for a $\beta$-disruptor with the minimum total cost

$$c(D_\beta) = \sum_{u \in V_\beta} c(u) + \sum_{e \in E_\beta} c(e).$$

We  further  notice  that  with  the  same  attack  cost,  if  we  are  allowed  to  attack  both  edges  and 
nodes,  instead  of  either  edges  or  nodes  separately,  a  network  can  be  destroyed  to  a  larger  extent. 


Figure 1: Minimum cost solutions to reduce 50% of the connectivity, assuming links have cost 2 and
nodes have cost 3: (a) nodes only, (b) links only, (c) joint nodes and links. The minimum cost is 6 if
attacking only nodes or only links, and 5 if both links and nodes are targeted. Thus, it is insufficient
to study node and link attacks separately.

Fig. 1 also illustrates a fundamental shortcoming of existing work: the inability to assess network
vulnerability under joint node and link attacks. The three sub-figures show the minimum cost
attack strategies to reduce pairwise connectivity by 50% ($\beta = 50\%$), assuming each link has cost 2
and each node has cost 3. While the minimum costs for both the node attack (Fig. 1a) and the link
attack (Fig. 1b) are 6, the minimum cost for the joint node-link attack (node 3 and link (6, 7))
(Fig. 1c) is only 5. Thus, it is insufficient to assess link vulnerability and node vulnerability
separately when both links and nodes in the network can be targeted. To make matters worse, if
node 3 and link (6, 7) have the same cost $\epsilon > 0$, the minimum costs for node, link, and node-link
attacks are $3 + \epsilon$, $4 + \epsilon$, and $2\epsilon$, respectively. As the ratios $(3 + \epsilon)/(2\epsilon)$ and
$(4 + \epsilon)/(2\epsilon)$ grow without bound as $\epsilon \to 0$, the existing methods
can seriously misjudge the network vulnerability.

To address this shortcoming, we studied the effect of joint node and link attacks in terms of
connectivity. We introduced a new problem, called the $\beta$-disruptor problem, which finds a minimum
cost set of nodes and links whose removal degrades the pairwise connectivity to a given fraction
$\beta$. The $\beta$-disruptor problem aims to provide a more comprehensive assessment of network
vulnerability. It generalizes both the $\beta$-vertex disruptor and the $\beta$-edge disruptor problems.
Along this direction, we proposed an $O(\sqrt{\log n})$ pseudo-approximation algorithm for the
$\beta$-disruptor problem. The main challenge is to find the right cut on directed graphs, as the
sparsest cut is no longer suitable. The results are presented in [7]. This result also improves on our
previous ratios of $O(\log^{1.5} n)$ for $\beta$-ED and $O(\log n \log\log n)$ for $\beta$-VD.
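
To make the benefit of joint attacks concrete, the following sketch finds minimum cost disruptors by exhaustive search on a small toy graph, once with nodes only, once with links only, and once with both. It is a brute-force illustration, not the pseudo-approximation algorithm of [7]; the toy graph is not the graph of Fig. 1, and the function names and uniform node and link costs are assumptions made for the example.

```python
from itertools import combinations


def pairwise_connectivity(nodes, edges):
    """Count unordered node pairs joined by a path (undirected graph)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in parent and v in parent:  # ignore edges at removed nodes
            parent[find(u)] = find(v)
    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())


def subsets(items):
    """Yield every subset of `items` as a tuple, smallest first."""
    for r in range(len(items) + 1):
        yield from combinations(items, r)


def min_cost_disruptor(nodes, edges, beta, node_cost, edge_cost,
                       allow_nodes=True, allow_edges=True):
    """Cheapest removal that leaves connectivity <= beta * C(n, 2)."""
    n = len(nodes)
    target = beta * n * (n - 1) / 2
    best = None
    for removed_nodes in (subsets(nodes) if allow_nodes else [()]):
        kept_nodes = [v for v in nodes if v not in removed_nodes]
        for removed_edges in (subsets(edges) if allow_edges else [()]):
            cost = (node_cost * len(removed_nodes)
                    + edge_cost * len(removed_edges))
            if best is not None and cost >= best:
                continue  # cannot improve on the best solution so far
            kept_edges = [e for e in edges if e not in removed_edges]
            if pairwise_connectivity(kept_nodes, kept_edges) <= target:
                best = cost
    return best


# Toy graph: a star centred at 0, joined by the link (3, 4) to a triangle.
TOY_NODES = list(range(7))
TOY_EDGES = [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5), (5, 6), (4, 6)]

for label, use_nodes, use_edges in [("nodes only", True, False),
                                    ("links only", False, True),
                                    ("joint", True, True)]:
    print(label, min_cost_disruptor(TOY_NODES, TOY_EDGES, 0.15, 3, 2,
                                    use_nodes, use_edges))
```

With node cost 3 and link cost 2, the joint search removes the star centre and the bridging link, which is cheaper than either restricted search on this toy instance.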

Variants. We further observed that even before the network begins fragmenting into pieces, its
Quality of Service (QoS) may already drop to an intolerably low level, so that the network can no
longer provide services. To this end, we presented a novel QoS-aware vulnerability assessment
framework, called QoS-Critical Vertices (QoSCV) / QoS-Critical Edges (QoSCE), as follows.

Definition 3 QoS-Critical Vertices (QoSCV) / QoS-Critical Edges (QoSCE): Given a directed graph
$G(V, E, s, t)$ with an $m$-dimensional weight vector $(w_1(u, v), w_2(u, v), \ldots, w_m(u, v))$ on each edge
$(u, v) \in E$. The weight vector for each $s$-$t$ path $P$ is defined as $(w_1(P), w_2(P), \ldots, w_m(P))$, where
$w_i(P) = \sum_{(u,v) \in P} w_i(u, v)$ for $i \in [1, \ldots, m]$. Given a constraint threshold vector
$(c_1, c_2, \ldots, c_m)$ with corresponding credit vector $(\lambda_1, \lambda_2, \ldots, \lambda_m)$, an $s$-$t$ path $P$
satisfies the $i$th constraint (denoted as $P \propto i$) iff $w_i(P) \le c_i$, and its SAT score $\phi(P)$ is defined
as $\phi(P) = \sum_{i:\, P \propto i} \lambda_i$. The SAT score for the graph $G$ is $\phi(G) = \max_{P \in G} \phi(P)$,
i.e., the maximum score among all $s$-$t$ paths. The QoSCV/QoSCE problem is to find a minimum set $S$ of
vertices/edges such that $\phi(G \setminus S) < p$ for a given score threshold $p$.
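
The SAT score of a single path can be evaluated directly from the definition. The sketch below assumes edge weights are stored as tuples in a dictionary keyed by edge and that a constraint is satisfied when the accumulated weight does not exceed its threshold; the function names and the sample data are hypothetical.

```python
def path_weights(path_edges, weights):
    """Component-wise sum of the m-dimensional edge weights along a path."""
    m = len(weights[path_edges[0]])
    return [sum(weights[e][i] for e in path_edges) for i in range(m)]


def sat_score(path_edges, weights, thresholds, credits):
    """Sum of the credits of all constraints the path satisfies."""
    totals = path_weights(path_edges, weights)
    return sum(credit
               for total, limit, credit in zip(totals, thresholds, credits)
               if total <= limit)


# Two metrics per edge, e.g. (delay, cost); all values are made up.
weights = {("s", "a"): (1, 5), ("a", "t"): (2, 5), ("s", "t"): (10, 1)}
thresholds = (4, 8)
credits = (3, 2)

via_a = sat_score([("s", "a"), ("a", "t")], weights, thresholds, credits)
direct = sat_score([("s", "t")], weights, thresholds, credits)
print(via_a, direct, max(via_a, direct))  # the last value is the graph score
```

Here the graph score is the maximum over the two $s$-$t$ paths of this toy instance; on a general graph the maximization ranges over all $s$-$t$ paths.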

We have provided an exact algorithm for small $m$ and a heuristic for arbitrary $m$.
The pseudo-code is shown in Algorithms 1 and 2. All the proofs and experimental
results can be found in [56].

2.1.2  Detailed  Results  for  Dynamic  and  Evolving  Networks 

We next investigated critical node identification in dynamic networks using two approaches: (i) using
time-series snapshots to represent the data, and (ii) using probabilistic graphs.

For the time-series snapshot approach, given $G_1, G_2, \ldots, G_T$ representing the network at times
$t_1, t_2, \ldots, t_T$, we find the set of critical nodes at time $t_i$ based on the solution at time $t_{i-1}$.
The solution is thus adapted from the previous time step rather than computed from scratch. The
algorithm is therefore able to handle the dynamics, providing a better tool to adaptively identify a set
of critical nodes during a sequence of attacks or during the recovery process. We have provided
Algorithm 1: Exact Algorithm MFMCSP

1: Input: directed graph $G = (V, E)$, constraint set $M = \{c_1, \ldots, c_m\}$, credit vector $(\lambda_1, \lambda_2, \ldots, \lambda_m)$, satisfactory score threshold $p$;
2: Output: solution set of edges of QoSCE.

3: $\mathcal{S} \leftarrow$ all the minimal combinations $ss$ of $M$ with $\sum_{c_i \in ss} \lambda_i \ge p$;

4: for each edge $(i, j) \in E$ do
5:     Set $f(i, j) = f(j, i) = 0$;

6:     Set $c_f(i, j) = 1$ and $c_f(j, i) = 0$.

7: end for

8: while $\mathcal{S} \neq \emptyset$ do

9:     $ss \leftarrow$ extracted from $\mathcal{S}$;

10:     while $\exists q \leftarrow$ the shortest path satisfying all the constraints in $ss$ do

11:         for each edge $(u, v) \in q$ do

12:             $c_f(q) = \min\{c_f(u, v) : (u, v) \in q\}$;

13:             $f(u, v) = f(u, v) + c_f(q)$; $f(v, u) = -f(u, v)$;

14:             $c_f(u, v) = c(u, v) - f(u, v)$; $c_f(v, u) = c(v, u) - f(v, u)$;

15:         end for

16:     end while

17: end while

18: all the vertices reachable from $s$ on the residual network induce a cut $T$.

19: Return $T$.

Algorithm 2: SDOP

1: Input: directed graph $G = (V, E)$, constant $p$;

2: Output: a set $D$ of edges to be removed.

3: Set $D \leftarrow \emptyset$;

4: while SAT_TEST($G$) == YES do

5:     find all $m$ single-metric shortest paths $\{p_1, \ldots, p_m\}$;

6:     for all edges $e \in E$ do

7:         Find the one that appears in the maximum number of such paths;

8:     end for

9:     $D \leftarrow D \cup \{e\}$; $E \leftarrow E \setminus \{e\}$;

10: end while
11: Return $D$.

several fitness functions for each type of change in a network, including node insertion/removal
and link insertion/removal. Based on these fitness functions, we provided two adaptive algorithms,
one for critical node identification and another for critical link identification. More details
can be found in [30, 31].

For the probabilistic network approach, we modeled a dynamic network using a probabilistic
network model, which allows us to incorporate network uncertainty (such as links and nodes
that disappear and reappear over time) and its prediction. Assessing network vulnerability in
probabilistic networks introduces a major challenge: computing the expected pairwise
connectivity (EPC) of a network. This problem is related to a famous open problem in network
reachability. We have shown that computing the EPC is #P-complete and presented several sampling
methods to estimate its value. Based on these sampling methods, we further developed solutions
to find a set of $k$ critical nodes by formulating the problem as a mathematical programming
problem and devising two approaches to overcome the difficulty of having an exponential number
of constraints in the mathematical formulation.

Formally,  we  studied  the  following  problem: 

$k$-Probabilistic Critical Nodes Problem ($k$-pCNP). Given a probabilistic network $\mathcal{G} = (V, E, p)$
and an integer $0 < k < n$, find a subset $S \subseteq V$ of $k$ nodes whose removal minimizes the expected
pairwise connectivity (EPC) of the residual network, where EPC is
defined as follows:
Algorithm 3: SAT_TEST

1: Input: directed graph $G = (V, E)$, constant $p$;

2: Output: YES if a satisfactory path probably exists, NO otherwise.
3: for every edge $e \in E$ do

4:     $\psi(e) \leftarrow \sum_{i=1}^{m} \ldots$;

5: end for

6: $P \leftarrow$ shortest $s$-$t$ path on metric $\psi$;

7: if $\psi(P) > p$ then

8:     Return NO;

9: else

10:     Return YES;

11: end if


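$$\mathrm{EPC}(\mathcal{G}) = \sum_{u, v \in V;\; u \neq v} \mathrm{REL}_{u,v}(\mathcal{G})$$

where $\mathrm{REL}_{u,v}(\mathcal{G})$ is the probability that $v$ is reachable from $u$ within $\mathcal{G}$.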

We have proven that computing the EPC is #P-complete (details of the proof can be found
in [13]), let alone finding such a set of $k$ nodes. Since every #P-complete problem either has
a fully polynomial randomized approximation scheme (FPRAS) or is essentially impossible to
approximate, we are interested in $(\epsilon, \delta)$-approximations for $\mathrm{EPC}(\mathcal{G})$, i.e., algorithms
returning an estimate of $\mathrm{EPC}(\mathcal{G})$ that is accurate to within a relative error of $\epsilon$ with
probability at least $1 - \delta$. An $(\epsilon, \delta)$-approximation is called an FPRAS if its running time is
bounded by a polynomial in $1/\epsilon$, $\log(1/\delta)$, and the size of the input instance. Along this
direction, we have developed the following algorithm.

Algorithm $(\epsilon, \delta)$ Component Sampling Algorithm to compute $\mathrm{EPC}(\mathcal{G})$

1. Let $P_E = \sum_{e \in E} p_e$

2. if $P_E < \ldots\, n^{-2}$ then

3.     return $S_2 = P_E$.

4. $C_2 \leftarrow 0$.

5. for $i = 1$ to $N_2(\epsilon, \delta)$ do

    • Select a node $u \in V$ uniformly.

    • Start a Breadth-First Search from $u$. For each encountered edge $(v, w)$, flip a coin of
      bias $p_{vw}$ to determine its availability.

    • Let $S_i$ be the number of visited nodes.

    • $C_2 = C_2 + (S_i - 1)$.

6. Return $S_2 = \ldots$ as an unbiased estimator of $\mathrm{EPC}(\mathcal{G})$.


A key property of the above algorithm is that it is provably an FPRAS; the proofs can be found
in [13]. This fast and accurate algorithm can be used in many applications that need to
compute pairwise connectivity, such as the network reachability problem.
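
A minimal sketch of the sampling loop is shown below. It assumes an undirected probabilistic graph given as a dictionary from edges to probabilities, defines the EPC over unordered node pairs, and uses the resulting scaling $n \cdot C / (2N)$ for $N$ samples; that scaling, the fixed sample count, and all names are assumptions of this example rather than details taken from [13]. The early exit for very small total edge probability is omitted.

```python
import random
from collections import deque


def estimate_epc(nodes, edge_prob, num_samples, seed=None):
    """Monte Carlo estimate of the expected pairwise connectivity.

    nodes     : list of node labels
    edge_prob : dict mapping an edge (u, v) to its existence probability
    """
    rng = random.Random(seed)
    adjacency = {v: [] for v in nodes}
    for (u, v), p in edge_prob.items():
        adjacency[u].append((v, p))
        adjacency[v].append((u, p))

    total = 0
    for _ in range(num_samples):
        start = rng.choice(nodes)
        visited = {start}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w, p in adjacency[v]:
                # An edge is only tested while one endpoint is unvisited,
                # so its coin is flipped at most once per sample.
                if w not in visited and rng.random() < p:
                    visited.add(w)
                    queue.append(w)
        total += len(visited) - 1  # nodes reached besides the start node

    # A uniformly chosen start reaches 2 * EPC / n other nodes on average.
    return len(nodes) * total / (2 * num_samples)


certain_path = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}
print(estimate_epc([0, 1, 2, 3], certain_path, 50, seed=1))
print(estimate_epc([0, 1], {(0, 1): 0.5}, 20000, seed=7))
```

Only the component of the sampled start node is explored, so a sample never touches parts of the graph that the start node cannot reach.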

We next present our solution to the $k$-pCNP, a two-stage stochastic program
formulated as follows:


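$$
\begin{aligned}
\min_{s \in \{0,1\}^n} \quad & \mathbb{E}[P(s, x, \xi)] && (1) \\
\text{s.t.} \quad & \sum_{i=1}^{n} s_i \le k && (2) \\
\text{where} \quad & P(s, x, \xi) = \min \sum_{i<j} (1 - x_{ij}) && (3) \\
\text{s.t.} \quad & x_{ij} \le s_i + s_j + 1 - \xi_{ij}, \quad (i, j) \in E, && (4) \\
& x_{ij} + x_{jk} \ge x_{ik}, \quad (i, j) \in E,\ k = 1..n && (5) \\
& x_{ij} = x_{ji}, \quad i, j = 1..n && (6) \\
& s \in \{0,1\}^n,\ x \in [0,1]^{n^2} && (7)
\end{aligned}
$$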


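Discretization. To solve the stochastic program numerically, one needs to consider all possible
realizations $G^l \in \mathcal{G}$ and their probability masses $\Pr(G^l)$. Then the two-stage stochastic
program can be written as a (one-level) mixed integer program, denoted by MIP:

$$
\begin{aligned}
\min \quad & \sum_{l=1}^{N} \Pr(G^l) \sum_{i<j} (1 - x^l_{ij}) && (8) \\
\text{s.t.} \quad & \sum_{i=1}^{n} s_i \le k && (9) \\
& x^l_{ij} \le s_i + s_j + 1 - \xi^l_{ij}, \quad (i, j) \in E,\ l = 1..N && (10) \\
& x^l_{ij} + x^l_{jk} \ge x^l_{ik}, \quad (i, j) \in E,\ k = 1..n,\ l = 1..N && (11) \\
& x^l_{ij} = x^l_{ji}, \quad i, j = 1..n,\ l = 1..N && (12) \\
& s \in \{0,1\}^n,\ x^l \in [0,1]^{n^2}, \quad l = 1..N && (13)
\end{aligned}
$$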
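
For a very small probabilistic graph, the expectation over realizations can be evaluated by direct enumeration, which also exposes the exponential number of edge realizations involved. The sketch below does this and then picks the best $k$ nodes by brute force. It is an illustration of the quantity being optimized, not a mixed integer programming solver; it treats the graph as undirected, counts unordered pairs, and all names are assumptions.

```python
from itertools import combinations, product


def pairwise_connectivity(nodes, edges):
    """Count unordered node pairs joined by a path (undirected graph)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())


def exact_epc(nodes, edge_prob, removed=()):
    """Exact EPC after deleting `removed`, enumerating all realizations."""
    kept = [v for v in nodes if v not in removed]
    edges = [e for e in edge_prob
             if e[0] not in removed and e[1] not in removed]
    epc = 0.0
    # One term per realization: 2 ** len(edges) terms in total.
    for state in product((False, True), repeat=len(edges)):
        prob = 1.0
        realized = []
        for edge, present in zip(edges, state):
            prob *= edge_prob[edge] if present else 1.0 - edge_prob[edge]
            if present:
                realized.append(edge)
        epc += prob * pairwise_connectivity(kept, realized)
    return epc


def best_k_nodes(nodes, edge_prob, k):
    """Brute-force k-pCNP: return (EPC, removed set) with the smallest EPC."""
    return min((exact_epc(nodes, edge_prob, subset), subset)
               for subset in combinations(nodes, k))


star = {(0, 1): 0.5, (0, 2): 0.5, (0, 3): 0.5}
print(exact_epc([0, 1, 2, 3], star))
print(best_k_nodes([0, 1, 2, 3], star, 1))
```

Both loops grow exponentially, in the number of edges and in $k$ respectively, so this is only usable on toy instances.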


The major challenge in solving this discretized form is that there is an exponential number
($N = 2^{|E|}$) of variables and constraints. Thus, solving the MIP is intractable even for very small
instances of $\mathcal{G}$. To overcome this difficulty, we developed two approximate mathematical programs
of substantially smaller size: (1) approximation via the expectation graph and (2) the sample average
approximation method, which reduces the number of realizations. The two solutions are shown in the
following pseudo-code.
Algorithm Rounding on the Expectation Graph Algorithm (REGA)

1. Obtain an LP relaxation of the MIP on the expectation graph with the relaxed constraints $s \in [0, 1]^n$.

2. Initialize the set of selected nodes $D = \emptyset$.

3. Repeat $k$ times the following steps

    • Solve the LP relaxation

    • Select $u = \arg\max_{i \in V \setminus D} s_i$.

    • Add $u$ to $D$ and fix $s_u = 1$

4. Return $k$ critical nodes in $D$.

Algorithm Sample Average Approximation Algorithm (SA3)

Parameter $T$: the number of samples
Phase 1: Delayed Constraints

1. Initialize an LP with the objective $\frac{1}{T} \sum_{l=1}^{T} \sum_{i<j} (1 - x^l_{ij})$ and only the constraints
   $s \in [0, 1]^n$, $x^l_{ij} \in [0, 1]$.

2. for $l = 1..T$ do

    • Generate the $l$th sample of $\mathcal{G}$ (adjacency matrix $\xi^l$).

    • Add the constraints involving $x^l_{ij}$ to the LP

    • Solve the updated LP

Phase 2: Iterative rounding

3. Initialize the set of selected nodes $D = \emptyset$.

4. Repeat $k$ times the following steps

    • Select $u = \arg\max_{i \in V \setminus D} s_i$.

    • Add $u$ to $D$ and fix $s_u = 1$

    • Re-solve the LP

5. Return $k$ critical nodes in $D$.


2.1.3  Detailed  Results  for  Interdependent  Networks 

We finally investigated the critical node detection problem on interdependent networks. Although
some work exists on the vulnerability assessment of interdependent networks, most of it
focuses on artificial models of interdependent networks, i.e., random interdependency between
networks, and ignores the detection of the top critical nodes in real networks. Therefore, we study
the following new optimization problem:
Definition 4 (IPND problem) Given an integer $k$ and an interdependent system $\mathcal{I}(G_s, G_c, E_{sc})$,
which consists of two networks $G_s = (V_s, E_s)$ and $G_c = (V_c, E_c)$ along with their interdependencies
$E_{sc}$. Let $L_{G_s}(T)$ be the size of the largest connected component of $G_s$ after the cascading failures
caused by the initial removal of the set of nodes $T \subseteq V_s$ in $G_s$. The IPND problem asks for a set $T$
of size at most $k$ such that $L_{G_s}(T)$ is minimized.

We used a well-accepted cascading failure model, which has been validated and applied in
many previous works. Initially, a few critical nodes fail in network $G_s$, which disconnects
a set of nodes from the largest connected component of $G_s$. Due to the interdependency
of the two networks, all the nodes in $G_c$ that are connected to failed nodes in $G_s$ are also affected
and therefore stop working. Furthermore, the failures cascade to nodes that are disconnected from the
largest connected component in $G_c$ and cause further failures back in $G_s$. The process continues
back and forth between the two networks until no more nodes fail.
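
The cascade described above can be simulated with a short loop. The sketch below assumes undirected networks and a one-to-one coupling in which every node of $G_s$ depends on exactly one node of $G_c$ and vice versa; the dictionary-based coupling, the tie-breaking rule for equally large components, and the function names are assumptions of this example.

```python
def largest_component(alive, edges):
    """Node set of the largest connected component among `alive` nodes."""
    adjacency = {v: [] for v in alive}
    for u, v in edges:
        if u in adjacency and v in adjacency:
            adjacency[u].append(v)
            adjacency[v].append(u)
    best, seen = set(), set()
    for start in sorted(alive):  # sorted order makes tie-breaking repeatable
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adjacency[v]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def cascade(nodes_s, edges_s, nodes_c, edges_c, coupling, initial_failures):
    """Return the surviving node sets (alive_s, alive_c) after the cascade.

    coupling maps each node of G_s to its coupled node in G_c.
    """
    inverse = {c: s for s, c in coupling.items()}
    alive_s = set(nodes_s) - set(initial_failures)
    alive_c = set(nodes_c)
    while True:
        before = (len(alive_s), len(alive_c))
        # Nodes cut off from the largest component of G_s fail.
        alive_s = largest_component(alive_s, edges_s)
        # Their coupled nodes in G_c fail, then G_c is pruned the same way.
        alive_c = {c for c in alive_c if inverse.get(c) in alive_s}
        alive_c = largest_component(alive_c, edges_c)
        # Failures in G_c propagate back to G_s.
        alive_s = {s for s in alive_s if coupling.get(s) in alive_c}
        if (len(alive_s), len(alive_c)) == before:
            return alive_s, alive_c


nodes_s, edges_s = [0, 1, 2, 3, 4], [(0, 1), (1, 2), (2, 3), (3, 4)]
nodes_c = [10, 11, 12, 13, 14]
edges_c = [(12, 13), (10, 13), (10, 14), (11, 14)]
coupling = {s: s + 10 for s in nodes_s}
print(cascade(nodes_s, edges_s, nodes_c, edges_c, coupling, [1]))
```

In this toy system a single initial failure ends with two surviving nodes in each network; the objective $L_{G_s}(T)$ of Definition 4 corresponds to the size of the first returned set.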

Our major findings are twofold: (1) We showed the $(2 - \epsilon)$-inapproximability of the IPND
problem on interdependent networks; and (2) we provided a greedy framework with various
centralities to solve the IPND problem. We validated the performance of our solutions on a wide
range of interdependent networks with different scales, topologies, and interdependencies. The
proposed centrality function on interdependent networks is important because it can serve
as a basic building block for future research that requires a centrality function.

In the maximum cascading (Max-Cas) algorithm, we iteratively select a node $u$ that leads to the
largest number of newly failed nodes, i.e., the maximum marginal gain over the current set of
attacked nodes $T$. When a new node $u$ fails, it triggers a chain of cascading failures. The number of
newly failed nodes, referred to as the cascading impact number, can be computed by simulating the
cascading failures with the initial set $T \cup \{u\}$ on the interdependent system $\mathcal{I}$. However, the
simulation of cascading failures is time-consuming because failures must be propagated between the
two networks, and each step of the cascade requires identifying the largest connected component of
each network.

To  this  end,  we  further  improved  the  running  time  of  our  algorithm  by  reducing  the  number  of 
simulations.  The  idea  is  to  check  potential  nodes  whose  removal  creates  at  least  one  more  failed 
node  in  the  same  network  due  to  the  cascading  failures.  That  is,  this  node  (or  its  coupled  node) 
disconnects  the  network  to  which  it  belongs,  i.e.,  it  (its  coupled  node)  is  an  articulation  node  of 
Gs  (or  Gc),  which  is  defined  as  any  vertex  whose  removal  increases  the  number  of  connected 
components  in  Gs  (or  Gc).  The  reason  is  illustrated  in  the  following  lemma. 

Lemma  1  Given  an  interdependent  system  3 (GS,GC,  Esc),  removing  a  node  u  G  Vs  from  the 
system  causes  at  least  one  more  node  fail  due  to  the  cascading  failure  ijfu  (or  its  coupled  node 
v  G  Vc)  is  an  articulation  node  in  Gs  (or  Gc). 

According to this property, the proposed algorithm first identifies all articulation nodes in both
residual networks using Hopcroft and Tarjan's algorithm. This algorithm runs in linear time on
undirected graphs, which is faster than a single simulation of cascading failures. Thus, the run
time of each iteration improves significantly, especially when the number of articulation nodes is
small. Denoting by Max-Cas($G_s$, $T$, $\{u\}$) the impact number of $u$, Algorithm 4 describes the
details of detecting critical nodes. In Algorithm 4, since it takes $O(n)$ time to compute the
cascading impact number of each node and at most $|A| < n$ articulation nodes are evaluated, the
run time is $O(kn^2)$ in the worst case. In practice, the actual run time is much lower because $A$
is small, as shown in our experiments.

Algorithm 4: Max-Cas Greedy Algorithm

Input: Interdependent system $\mathcal{I}(G_s, G_c, E_{sc})$, an integer $k$
Output: Set $T \subseteq V_s$ of $k$ critical nodes
$T \leftarrow \emptyset$
for $i = 1$ to $k$ do
    $A_s, A_c \leftarrow$ sets of articulation nodes of $G_s$ and $G_c$, respectively
    $A \leftarrow \{u \in V_s \mid u \in A_s \lor ((u,v) \in E_{sc} \land v \in A_c)\}$
    if $A \neq \emptyset$ then
        $u \leftarrow \arg\max_{u \in A}$ Max-Cas($G_s$, $T$, $\{u\}$)
    else
        $u \leftarrow$ any node in $V_s \setminus T$
    end if
    $T \leftarrow T \cup \{u\}$
    Update $\mathcal{I}[V_s \setminus T]$
end for
Return $T$
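
The candidate-filtering step can be illustrated with a standard depth-first search for articulation nodes. The sketch below is only an illustration under stated assumptions: simple undirected graphs given as adjacency dictionaries, a one-to-one `coupling` dictionary from $G_s$ to $G_c$, and hypothetical function names `articulation_points` and `candidates`.

```python
import sys


def articulation_points(adj):
    """Articulation nodes of an undirected simple graph via DFS low-link values."""
    sys.setrecursionlimit(max(10000, 2 * len(adj) + 100))
    disc, low, points = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v not in disc:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # a non-root u separates v's subtree if that subtree cannot reach above u
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
            elif v != parent:
                low[u] = min(low[u], disc[v])
        # the DFS root is an articulation node iff it has more than one DFS child
        if parent is None and children > 1:
            points.add(u)

    for start in adj:
        if start not in disc:
            dfs(start, None)
    return points


def candidates(adj_s, adj_c, coupling):
    """Nodes of G_s that are articulation nodes or whose coupled node is one in G_c."""
    a_s = articulation_points(adj_s)
    a_c = articulation_points(adj_c)
    return {u for u in adj_s if u in a_s or coupling.get(u) in a_c}
```

Only the nodes returned by `candidates` would need a full cascade simulation in each greedy iteration. The recursive DFS is kept for brevity; an iterative version would be preferable on very large graphs.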

The new centrality measure for interdependent networks must capture both the intra-centrality (the
centrality of nodes within each network) and the inter-centrality (the centrality formed by the
interconnections between the two networks). Given an interdependent system
$\mathcal{I}(G_s, G_c, E_{sc})$, a node $u \in V_s$ is more likely to be critical if its coupled
node $v \in V_c$ is critical. Furthermore, when node $u$ is considered critical, its neighbors also
become more important, since their failures can cause $u$ to fail. In turn, the criticality of these
nodes implies the criticality of their coupled nodes. To capture this intricate relation in
interdependent systems, we developed an iterative method to compute node centrality, called
Iterative Interdependent Centrality (IIC). Initially, the centralities of all nodes in $G_s$ are
computed with a traditional centrality, e.g., degree centrality or betweenness centrality. These
centralities are then reflected onto the coupled nodes in $G_c$, and the centralities of the nodes in
$G_c$ are updated based on the reflected values. The centralities of the nodes in $G_c$ are in turn
reflected onto the nodes of $G_s$, whose centralities are updated accordingly. The two key points of
IIC are the updating function and the convergence.

The updating function is defined as follows:

$$C(u) = \alpha\, w(u) + (1 - \alpha) \sum_{v:(u,v) \in E} \frac{w(v)}{d_v}$$

where $w(\cdot)$ denotes the centralities of the nodes and the reservation factor $\alpha$ lies in
the interval $[0,1]$. The underlying reason for using a degree-based centrality is that a node is
usually more critical if most of its neighbors are critical nodes. The function can easily be
modified to handle weighted graphs. This centrality can be computed with matrix multiplications,
and we proved its convergence in our paper [32], published in the IEEE Transactions on Smart Grid.
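
As an illustration of the updating function, the sketch below applies it alternately to the two networks. It is a simplified reading of the description above, not the published IIC code: it assumes a one-to-one coupling, starts from degree centrality, normalizes after every update, and uses hypothetical names (`update`, `normalize`, `iic`).

```python
def update(w, adj, alpha):
    """One application of C(u) = alpha*w(u) + (1 - alpha) * sum over neighbors v of w(v)/d_v."""
    return {
        u: alpha * w[u] + (1 - alpha) * sum(w[v] / len(adj[v]) for v in adj[u])
        for u in adj
    }


def normalize(w):
    """Scale the values so that they sum to 1."""
    total = sum(w.values())
    return {u: x / total for u, x in w.items()}


def iic(adj_s, adj_c, coupling, alpha=0.5, rounds=50):
    """Alternate the update between G_s and G_c, reflecting values across coupled nodes."""
    inverse = {c: s for s, c in coupling.items()}
    # initial centralities of G_s: degree centrality (one of the traditional choices)
    w_s = normalize({u: float(len(adj_s[u])) for u in adj_s})
    for _ in range(rounds):
        reflected_c = {c: w_s[inverse[c]] for c in adj_c}   # reflect G_s values onto G_c
        w_c = normalize(update(reflected_c, adj_c, alpha))
        reflected_s = {s: w_c[coupling[s]] for s in adj_s}  # reflect G_c values back onto G_s
        w_s = normalize(update(reflected_s, adj_s, alpha))
    return w_s
```

A fixed number of rounds is used here for simplicity; in practice one would stop once successive vectors differ by less than a tolerance.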

k Interdependent Networks. To solve the above IPND problem in $k$ interdependent networks, we
devised an interdependent centrality method. This method not only solves this problem but can also
be applied to other problems that require a centrality measure. Many centrality methods have been
developed for individual networks, but none has been studied for $k$ interdependent networks. The
two major challenges are to measure the centrality of a node with respect to its own network and
then with respect to the other networks. Furthermore, the iterative updating must converge. To this
end, we developed the following method.

When two nodes are coupled, the failure of one is equivalent to the failure of the other, so both
should have the same centrality value. There are two kinds of nodes: (1) nodes that have no coupled
nodes in other networks and (2) nodes that have at least one coupled node in other networks. The
centrality of the former should depend only on the structure of its own network, whereas the
centrality of the latter should depend on neighbor nodes in multiple networks. The key idea in
designing the new centrality is to decompose the centrality of coupled nodes into components
corresponding to the different networks.

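If $u$ belongs only to network $G^{i}$, then:

$$C(u) = \sum_{v:(u,v) \in E^{i}} \frac{C(v)}{d_v}$$

If $u^{i_1}, \ldots, u^{i_p}$ are $p$ coupled nodes of networks $G^{i_1}, \ldots, G^{i_p}$, then the
centrality of these nodes is:

$$C(u^{i_1}) = \ldots = C(u^{i_p}) = \frac{1}{\sum_{t=1}^{p} \beta_{i_t}} \sum_{t=1}^{p} \beta_{i_t} \left( \frac{1}{2} C(u^{i_t}) + \frac{1}{2} \sum_{v:(u^{i_t},v) \in E^{i_t}} \frac{C(v)}{d_v} \right)$$

$$= \frac{1}{\sum_{t=1}^{p} \beta_{i_t}} \sum_{t=1}^{p} \beta_{i_t} \left( \alpha\, C(u) + (1 - \alpha) \sum_{v:(u^{i_t},v) \in E^{i_t}} \frac{C(v)}{d_v} \right)$$

where the first form corresponds to $\alpha = 1/2$.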

The parameter $\beta_{i_t}$ reflects the fragility of network $G^{i_t}$: if $G^{i_t}$ is very
fragile, e.g., very sparse, the value of $\beta_{i_t}$ is large.

The remaining key point of this method is to prove its convergence, which we did via the following
steps. (We include a sketch of the proofs here because the paper is still being written and is not
yet available.)


Definition 5 The associated graph of a nonnegative square matrix $A$ is a directed graph $G_A$ on
$n$ vertices, where $n$ is the size of $A$. $G_A$ has an edge from vertex $i$ to vertex $j$ when
$A_{ij} > 0$.

Lemma 2 The nonnegative matrix $A$ is irreducible if and only if its associated graph $G_A$ is
strongly connected.

Lemma 3 The nonnegative matrix $A$ is aperiodic if and only if the greatest common divisor of the
lengths of the closed directed paths in its associated graph $G_A$ is 1.

Theorem 1 (Perron-Frobenius theorem) If the nonnegative matrix $A$ is irreducible and aperiodic,
then there exists a positive real number $r$ such that $r$ is an eigenvalue of $A$ and any other
eigenvalue $\lambda$ is strictly smaller than $r$ in absolute value, $|\lambda| < r$.

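At each iteration, the centrality vector is updated and normalized as

$$x^{t} = \frac{M^{k} M^{k-1} \cdots M^{1} x^{t-1}}{|M^{k} M^{k-1} \cdots M^{1} x^{t-1}|} = \frac{M x^{t-1}}{|M x^{t-1}|}$$

The metric converges if $M = M^{k} M^{k-1} \cdots M^{1}$ is irreducible and aperiodic (Theorem 1),
where the update matrix of a network has entries

$$M_{uv} = \begin{cases} \alpha & \text{if } u = v \\ (1 - \alpha)/d_v & \text{if } (u,v) \in E \\ 0 & \text{otherwise} \end{cases}$$

which corresponds to the per-node update

$$x[u] = \alpha\, x[u] + (1 - \alpha) \sum_{v:(u,v) \in E} \frac{x[v]}{d_v}$$

Uniqueness. The centrality vector $x^{t}$ converges to a unique vector $x$ regardless of the initial
vector $x^{0}$. Let $v_1, v_2, \ldots, v_n$ be a basis of eigenvectors of $M$, so that

$$M v_i = \lambda_i v_i$$

$x^{0}$ can be represented in this basis as $x^{0} = \sum_i c_i v_i$. Then:

$$M^{t} x^{0} = \sum_i c_i M^{t} v_i = \sum_i c_i \lambda_i^{t} v_i$$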
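
The expansion above can be taken one step further to see why the normalized iterate settles on a single direction. The following is a sketch under assumptions that the text does not state explicitly: $M$ is diagonalizable, $\lambda_1 = r$ is the Perron eigenvalue of Theorem 1 with eigenvector $v_1$, and $x^{0}$ is nonnegative and nonzero, so that $c_1 > 0$.

```latex
\begin{align*}
M^{t} x^{0}
  &= \lambda_1^{t} \Big( c_1 v_1 + \sum_{i \ge 2} c_i \Big( \frac{\lambda_i}{\lambda_1} \Big)^{t} v_i \Big),
  \qquad \Big| \frac{\lambda_i}{\lambda_1} \Big| < 1 \ \text{ for } i \ge 2, \\
x^{t}
  &= \frac{M^{t} x^{0}}{|M^{t} x^{0}|}
  \;\longrightarrow\; \frac{v_1}{|v_1|} \quad \text{as } t \to \infty .
\end{align*}
```

Every term with $i \ge 2$ decays geometrically, so the limit does not depend on the coefficients $c_i$, that is, on the choice of $x^{0}$.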


2.1.4  Detailed  Results  for  Cascading  Failures 

We re-investigated the above problems in the presence of cascading. Not only can failures cascade,
but the attack itself can also propagate. For example, chemical agents can spread through a network
(e.g., a water network), viruses can spread over computer networks, and deceptive messages can
spread their influence over online social networks. In this task, we therefore further examined
network vulnerability under these types of attacks, thereby providing a much more effective defense
strategy. We focused on finding the minimum number of nodes such that, if these nodes are attacked,
the attack can spread to the whole network and create the greatest damage. These nodes are
therefore the most vulnerable to this type of attack. As different attacks have their own
propagation models, we investigated this problem under various propagation models. In addition, an
attack cannot spread indefinitely, so we imposed a time constraint on the spreading. Along this
direction, we formulated the following new optimization problem and studied its inapproximability
along with approximation algorithms.

Definition 6 (Critical Spreading Nodes Identification (CSNI)) Given a complex system $S$, a
latency bound $d$, and a propagation model $P$, find a minimum subset of nodes $N$ such that by
launching attacks at $N$ initially, all other nodes in $S$ will be “infected” within at most $d$
hops under $P$.
First, we considered $P$ to be a linear threshold model. In this model, a node $v$ is infected
(also referred to as activated) iff more than $\rho_v d(v)$ neighbors of $v$ are active, where
$d(v)$ denotes the degree of node $v$ and $\rho_v$ is an input threshold. Once $v$ is active, it
takes part in spreading the influence and can infect other nodes. The spread stops after $d$ hops.
In [22, 35], we showed that when $d = 1$, CSNI cannot be approximated within
$\ln \Delta - O(\ln \ln \Delta)$ unless P = NP, where $\Delta$ is the maximum degree of $G$. We also
showed that it cannot be approximated within $\ln B - O(\ln \ln B)$ on graphs with degrees bounded
by $B$, and cannot be approximated within $(1/2 - \epsilon) \ln n$ unless
$NP \subseteq DTIME(n^{O(\log \log n)})$. When $1 < d < 4$, the problem cannot be approximated
within $O(\ln Z)$, and for $d > 4$, it cannot be approximated within $2^{\log^{1-\epsilon} n}$. All
detailed proofs can be found in [22, 35].

For $d = 1$, we developed a tight approximation algorithm with ratio $H((1 + \rho)\Delta)$, where
$H(n)$ is the harmonic function. In the general case where $d < 4$, we devised a greedy algorithm
with a tight $\log n$ ratio on power-law graphs. The algorithm scales to very large networks
consisting of millions of nodes and edges [22]. It can also be modified to handle the case in which
only a certain portion of the network, rather than the whole network, is to be infected.
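
To make the hop-bounded threshold model concrete, the sketch below checks whether a given seed set infects the whole network within $d$ hops. It is an illustrative reading of the model as described above (a node activates when strictly more than $\rho_v d(v)$ of its neighbors are active, with synchronous rounds); the function names `spread` and `infects_all` are hypothetical.

```python
def spread(adj, rho, seeds, d):
    """Return the set of nodes active after at most d synchronous hops."""
    active = set(seeds)
    for _ in range(d):
        newly = {
            v for v in adj
            if v not in active
            and sum(1 for x in adj[v] if x in active) > rho[v] * len(adj[v])
        }
        if not newly:
            break  # the spread has died out before the latency bound
        active |= newly
    return active


def infects_all(adj, rho, seeds, d):
    """True if attacking `seeds` infects every node within d hops."""
    return spread(adj, rho, seeds, d) == set(adj)


# Example: a path 0-1-2-3-4 with a uniform threshold of 0.4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
rho = {v: 0.4 for v in adj}
middle_seed_works = infects_all(adj, rho, {2}, 2)
```

A feasibility check of this kind is the basic building block that a greedy or exact CSNI solver would call repeatedly.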


Effective Defense Strategies to Dynamic Attacks with Cascading Failures. We investigated serial
attack points that maximize the number of failed nodes after the cascading failures. These attack
points are regarded as the most vulnerable nodes, which need to be protected during attacks. We
provided a theoretical analysis, including inapproximability results, and algorithmic solutions for
this problem.

Due to cascading failures, the failure of a small set of nodes $S$ can result in a catastrophic
number of failed nodes. The nodes in $S$ therefore become the most critical nodes. Additionally,
the order in which these nodes are destroyed can lead to different outcomes. In this study, we
considered a scenario in which attacks are launched one after another; that is, the nodes in $S$
are destroyed in a certain order to maximize the malfunction of the network. Furthermore, nodes
fail in a cascading manner due to the load redistribution of failed nodes, which we call the Load
Redistribution model (LR-model). In this model, a set of nodes $S$ fails initially, and the failures
then propagate to other nodes in time steps. When node $u$ fails, its load is redistributed to its
neighbors, and each alive neighbor receives an additional load proportional to its weight.
Precisely, each neighbor $v$ of $u$ receives the following additional load:

$$\Delta L(v) = L(u) \times \frac{w(v)}{\sum_{i \in N^{+}(u)} w(i)}$$

where the sum runs over the alive neighbors of $u$.


Due to the load redistribution, the loads of some nodes exceed their capacities, and these nodes
fail in the next time step. The process of load redistribution and node failure stops when no more
nodes fail. The set of failed nodes caused by the initial failure of $S$ is denoted by $F(S)$.

In particular, given an ordered set $S = \{s_1, s_2, \ldots, s_k\}$, the set of failed nodes after
$s_i$ is attacked is $F_i(S) = F(F_{i-1}(S) \cup \{s_i\})$. Denote by $F^{+}(S) = F_k(S)$ the set of
failed nodes when the nodes in $S$ are attacked serially. We formally define the problem as follows.


Definition 7 (Cascading Critical Node Problem (Cas-CNP)) Given a network $G = (V, E)$ and an
integer $k$, the problem asks to find an ordered subset $S \subseteq V$ of size $|S| = k$ such that
the serial failure of the nodes in $S$ maximizes the number of failed nodes $|F^{+}(S)|$ under the
LR-model.
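
The LR-model and the serial failure sets $F_i(S)$ can be simulated directly. The sketch below follows the definitions literally and makes a few assumptions of its own: all weights are positive, a node fails when its load strictly exceeds its capacity, and the names `failed_set`, `serial_failed_set`, `load`, `cap`, and `weight` are hypothetical.

```python
def failed_set(adj, load, cap, weight, initial):
    """F(S): nodes that end up failed when the nodes in `initial` fail at time 0."""
    load = dict(load)  # work on a copy so the caller's loads stay untouched
    failed = set()
    frontier = set(initial)
    while frontier:
        failed |= frontier
        for u in frontier:
            alive = [v for v in adj[u] if v not in failed]
            total = sum(weight[v] for v in alive)
            for v in alive:
                # each alive neighbor receives a share proportional to its weight
                load[v] += load[u] * weight[v] / total
        # nodes whose load now exceeds their capacity fail in the next time step
        frontier = {v for v in adj if v not in failed and load[v] > cap[v]}
    return failed


def serial_failed_set(adj, load, cap, weight, order):
    """F+(S) for an ordered attack, using F_i = F(F_{i-1} union {s_i})."""
    failed = set()
    for s in order:
        failed = failed_set(adj, load, cap, weight, failed | {s})
    return failed
```

The objective of Cas-CNP is then the size of the set returned by `serial_failed_set` for a candidate ordered set $S$.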
Our major findings for this task are twofold: (1) we showed the $O(n^{1-\epsilon})$-inapproximability
of Cas-CNP; and (2) we provided the cascading potential metric to solve the problem.

In detail, our results are as follows.

Theorem 3 It is NP-hard to approximate Cas-CNP within a ratio of $O(n^{1-\epsilon})$ for any
constant $1 > \epsilon > 0$.

We used a gap-introducing reduction to prove the inapproximability of Cas-CNP, via a
polynomial-time reduction from MIN3SC2, a restricted variant of the Set Cover problem. The proof
can be found in [3].

Cascading Potential Metric. In dynamic attacks (attacks launched one after another), we need to
consider the co-impact of attacks to obtain a large number of failed nodes. Therefore, we defined
the cascading potential of node $u$, denoted by $C(u)$, as follows:

$$C(u) = |F(\{u\})| + \alpha\, \frac{\sum_{v \in V - F(\{u\})} \Delta L_u(v)}{\sum_{v \in V - F(\{u\})} (C(v) - L(v))}$$

where $F(\{u\})$ is the set of failed nodes when $u$ fails, $\Delta L_u(v)$ is the additional load
that $v$ receives due to the failure of $u$, and $C(v) - L(v)$ in the denominator is the remaining
capacity of $v$.

This metric helps evaluate the importance of nodes with respect to different attack scenarios.
Based on this metric, we developed Algorithm 5 for the Cas-CNP problem.


Algorithm 5: Cascading Potential Algorithm

Input: A network $G = (V, E)$, an integer $k$, and a parameter $\alpha$.
Output: A set $S$ of $k$ attacked nodes.
Compute the centrality of all nodes
Sort nodes in non-increasing order of centrality: $C(u_1) \ge C(u_2) \ge \ldots \ge C(u_n)$
$S \leftarrow \emptyset$
$j \leftarrow 1$
for $i = 1$ to $k$ do
    while $u_j \in F^{+}(S)$ do
        $j \leftarrow j + 1$
    end while
    $S \leftarrow S \cup \{u_j\}$
end for
Return $S$
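
The selection loop of Algorithm 5 can be written compactly once the node scores and a failure simulator are available. In the sketch below both are passed in as arguments, which is an assumption made for illustration: `potential` is any precomputed score per node and `failed_after` is any callable that returns the failed set for an ordered attack (for example, an LR-model simulator). The function name is hypothetical.

```python
def cascading_potential_attack(nodes, potential, failed_after, k):
    """Pick up to k nodes in non-increasing score order, skipping nodes that already failed."""
    ranked = sorted(nodes, key=lambda u: potential[u], reverse=True)
    chosen, j = [], 0
    for _ in range(k):
        # attacked nodes count as failed even if the simulator does not report them
        failed = set(failed_after(chosen)) | set(chosen)
        while j < len(ranked) and ranked[j] in failed:
            j += 1
        if j == len(ranked):
            break  # every node has already failed; fewer than k attacks are needed
        chosen.append(ranked[j])
    return chosen
```

Because the ranking is computed once, the cost is dominated by the `k` calls to the failure simulator.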


Cooperative Attacks. The above strategies focus on selecting nodes to maximize the number of newly
failed nodes and the redistributed load. However, if nodes have a high tolerance factor, they can
withstand the additional load redistributed from attacked nodes. As a consequence, the load of many
nodes increases, but no or only a few additional nodes fail. In this case, the total number of
failed nodes at the end of the cascading process is very low. We therefore developed another
solution to overcome this challenge.
Since a node $u$ fails when its load exceeds its capacity, we can use the difference
$C(u) - L(u)$ as the health of $u$. The weaker $u$ is, the more damage it receives under the same
amount of attack $\Delta L$. Thus, the preference of attacking $u$ can be defined as:

$$\gamma(u, \Delta L) = \begin{cases} \left( \dfrac{\Delta L}{C(u) - L(u)} \right)^{\alpha} & \text{if } \Delta L + L(u) < C(u) \\ 1 & \text{otherwise} \end{cases}$$

where $0 < \alpha < 1$ is a tunable parameter.

The benefit of this attack preference function is that the more wounded a node is, the higher its
attack preference. This property is stated in Lemma 4.

Lemma 4 For any node $u$ at two points in time, if $u$ is more wounded at the second point, i.e.,
$L_2(u) > L_1(u)$, then the attack preference under the same attack $\Delta L$ is higher at the
second point: $\gamma_2(u) > \gamma_1(u)$.

In such a cooperative attack, we defined the efficiency of selecting the next node $v$ to be
destroyed. If the load redistributed from $v$ to $u$ is $\Delta L(u)$, the efficiency of $v$ in
taking down $u$ is:

$$A(v, u) = \gamma(u, \Delta L(u))\, \phi(L(u))$$

The overall efficiency of $v$ is the total:

$$A(v) = \sum_{u \in V \setminus v} A(v, u)$$

The  efficiency  function  has  the  following  properties: 

Lemma 5 Given a fixed load $L(u)$ and attack $\Delta L$ with $L(u) + \Delta L < C(u)$, the
efficiency on $u$ is monotone decreasing and goes to 0 as the capacity $C(u)$ increases to infinity.

Lemma 6 Suppose that the capacity $C(u)$ is linear in the load, $C(u) = T \cdot L(u)$ with constant
factor $T$. Then, given a fixed attack $\Delta L$ with $L(u) + \Delta L < C(u)$, the efficiency on
$u$ is monotone non-increasing and goes to 0 as the load $L(u)$ increases to infinity.

Based on the efficiency evaluation, we developed the Cooperating Attack (CA) algorithm shown in
Algorithm 6.

Algorithm 6: Cooperating Attack (CA)

Input: A network $G = (V, E)$ and an integer $k$.
Output: A set $S$ of $k$ seed nodes.
$S \leftarrow \emptyset$
for $i = 1$ to $k$ do
    Form $G_i$ as the network after the failure of $S$ in $G$
    Evaluate the efficiency of all nodes in $G_i$
    Select $u$ as the node with the highest efficiency
    $S \leftarrow S \cup \{u\}$
end for
Return $S$

Identification of Critical Nodes with Threshold Cascading Failure. In this study, we investigated
the vulnerability of networks under simultaneous attacks with cascading failures. More
specifically, we identified the $k$ most critical nodes of a network whose simultaneous removal
minimizes the total pairwise connectivity of the remaining network under the cascading failure.
That is, once more than $\rho_v$ of the neighbors of a node $v$ have failed, $v$ fails and this
failure propagates further. We devised an Integer Programming (IP) formulation with a sparse metric
technique to obtain the optimal solution for networks with a couple of thousand nodes. For larger
networks, we proved the inapproximability and designed a near-optimal solution. Furthermore, we
developed a centrality method to identify critical nodes in $k$-interdependent networks.

We focused on finding critical nodes in (i) an individual network with threshold cascading failure
and (ii) $k$ interdependent networks. More specifically, we investigated the following Cascading
Critical Node Detection (CCND) problem:

Definition 8 Given two integers $k$ and $d$, a vector $\theta$ of $n$ fractional numbers where each
$\theta_u \in (0, 1)$, and an undirected graph $G = (V, E)$, let $P(S)$ be the total pairwise
connectivity of the residual graph after the $d$-hop cascading failures caused by the initial
removal of the set of nodes $S \subseteq V$. The CVND problem asks for the $k$ most vulnerable nodes
such that $P(S)$ is minimized.


For the cascading failure, we considered the threshold model in which each node $u$ in the network
has a threshold $\theta_u \in [0, 1]$, typically drawn from some probability distribution. Starting
from an initial set of failed nodes $F_0$, the failure cascade unfolds deterministically in discrete
rounds: in round $t$, all nodes that failed by round $t - 1$ remain failed, and another node $v$
fails if the fraction of its failed neighbors is at least $\theta_v$, i.e.,
$|N(v) \cap F_{t-1}| \ge \theta_v \deg(v)$, where $F_{t-1}$ is the set of nodes that have failed by
the end of round $t - 1$.
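
The threshold cascade and the objective $P(S)$ can both be evaluated with short routines. The sketch below is an illustration rather than the code used in the experiments; it assumes an undirected graph stored as an adjacency dictionary, treats isolated nodes as never failing through the cascade (a choice made here, not stated in the text), and uses hypothetical function names.

```python
from collections import deque


def threshold_cascade(adj, theta, seeds, d):
    """Nodes failed after at most d rounds of the threshold cascade started at `seeds`."""
    failed = set(seeds)
    for _ in range(d):
        newly = {
            v for v in adj
            if v not in failed and adj[v]
            and sum(1 for x in adj[v] if x in failed) >= theta[v] * len(adj[v])
        }
        if not newly:
            break
        failed |= newly
    return failed


def pairwise_connectivity(adj, removed):
    """Number of node pairs that remain connected after deleting `removed`."""
    seen, total = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        size, queue = 0, deque([start])
        seen.add(start)
        while queue:
            x = queue.popleft()
            size += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        total += size * (size - 1) // 2  # pairs inside this component
    return total


def objective(adj, theta, seeds, d):
    """P(S): pairwise connectivity of the residual graph after the d-hop cascade."""
    return pairwise_connectivity(adj, threshold_cascade(adj, theta, seeds, d))
```

An exhaustive or greedy search over seed sets of size $k$ would call `objective` for each candidate set.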

We first proved that the problem is NP-hard to approximate within the ratio stated in Theorem 4
below, where $n$ is the size of the network, which makes it unrealistic to obtain optimal solutions
quickly within the time constraint. To this end, we proposed TRGA, an iterative 2-phase algorithm
that solves the problem effectively and in a timely manner. At a high level, in each iteration TRGA
detects the ultimate failed nodes after cascading failures and traces back to the critical nodes,
and it terminates once $k$ critical nodes are detected. TRGA also uses local search, constraint
pruning, and lazy-update techniques to further improve its efficiency and shorten its processing
time. In addition, we formulated a mathematical program to obtain the optimal solution and applied
a sparse metric technique to reduce the number of constraints. The performance of TRGA was
validated on both real-world and synthetic networks with different topologies.

More  specifically,  we  proved  the  following  theorem: 


Theorem 4 Assuming $P \neq NP$, the CVND problem is NP-hard to approximate within

/(!+  )2(n-fc), 

Q( - - - 1 - )  for  any  e  <  1  —  logn  2  on  general  graphs. 


The  detailed  proofs  can  be  found  in  [1]. 

Based on the above inapproximability result, we provided the following IP with the sparse metric
technique to handle small networks with thousands of nodes.
For each node $i \in V$ and all integers $t \in [0, d]$, we define

$$v_i^{t} = \begin{cases} 1, & \text{if node } i \text{ fails in round } t \\ 0, & \text{otherwise} \end{cases}$$

Note that $v_i^{0} = 1$ when node $i$ is a vulnerable node and fails at the beginning. Then, using
$u_{ij}$ defined as above, we have the following ILP:

$$\begin{aligned}
\min \quad & \sum_{i,j \in V} u_{ij} \\
\text{s.t.} \quad & v_i^{d} + v_j^{d} + u_{ij} \ge 1 && \forall (i,j) \in E \\
& u_{ij} + u_{jh} - u_{hi} \le 1 && \forall i, j, h \in V \\
& \sum_{i \in V} \cdots \\
& \sum_{j \in N(i)} v_j^{t-1} + \theta_i \cdot \deg(v_i)\, v_i^{t-1} \ge \theta_i \cdot \deg(v_i)\, v_i^{t} && \forall i \in V,\ 0 < t \le d \\
& v_i^{t} \ge v_i^{t-1} && \forall i \in V,\ 0 < t \le d \\
& v_i^{t} \in \{0, 1\} && \forall i \in V,\ 0 \le t \le d \\
& u_{ij} \in \{0, 1\} && \forall i, j \in V
\end{aligned}$$

where the objective is to minimize the total pairwise connectivity. The first constraint guarantees
that at least one endpoint of a link is deleted after $d$ rounds of cascades if its two endpoints
are disconnected in the optimal solution. The second constraint imposes triangular connectivity: if
nodes $i$ and $j$ are connected and nodes $j$ and $h$ are connected, then nodes $i$ and $h$ have to
be connected. The third constraint means that the total pairwise connectivity after $d$ rounds of
failure cascades is at most a $\beta$ fraction of all node pairs. The last two constraints handle
the cascade process and keep failed nodes failed in the following rounds, respectively.

For large-scale networks, we provided the 2-phase TRGA algorithm shown in Algorithm 7.

Algorithm 7: TRGA Algorithm

Input: Network $G$, threshold vector $\theta$
Output: The set $S$ of $k$ vulnerable nodes
$k' \leftarrow k$;
$S \leftarrow \emptyset$;
while $|S| < k$ do
    $k' \leftarrow k - |S|$;
    $D \leftarrow$ $k'$ largest-degree nodes in $G[V \setminus S]$;
    $\leftarrow$ #failed nodes after cascading failures by removing $D$ from $G[V \setminus S]$;
    $U \leftarrow \emptyset$;
    // Ultimate Failure Nodes Identification
    while Pairwise Connectivity > Ip do
        Use Constraint Pruning in [?] to solve the LP formulation with sp;
        $\leftarrow$ *p $-$ disconnected node-pairs after removing $u$;
        $u \leftarrow$ the node with largest $v^{*}$;
        $U \leftarrow U \cup \{u\}$;
        $G \leftarrow G[V \setminus \{u\}]$;
    end
    // Critical Nodes Tracing Back
    $Q \leftarrow \emptyset$; // Priority Queue
    $S' \leftarrow \emptyset$;
    while $\exists$ one node does not fail do
        if $Q = \emptyset$ then
            foreach node $u$ do
                Calculate the cascading influence after removing $u$ from $G$;
            end
            Construct $Q$ based on cascading influence of each node;
        end
        else
            $S' \leftarrow S' \cup$ the node in $Q$ with max priority;
            Update cascading influence caused by removing this node;
        end
    end
    if $|S'| > k'$ then
        $S \leftarrow S \cup$ $k'$ largest-degree nodes in $G[V \setminus S]$;
    end
    else
        $S \leftarrow S \cup S'$;
    end
end
// Local Search
$S^{*} \leftarrow S$;
foreach node $u \in S$ do
    Swapping($u$);
end
$S \leftarrow S^{*}$;
return $S$;

2.2 Network Structural Interdependency and Vulnerability Assessment

We continued to investigate network vulnerability from the network structure perspective. There
are several reasons to study this approach: (i) changing network structures changes the network
functions and thus may break down the system; (ii) changes or failures occurring in one module can
have a profound impact that consequently leads to the transformation of other modules, which
requires us to understand the inter- and intra-dependence of these modules.

We used community structure to approximate the network structure. To accomplish the second goal,
we studied the following: (1) Prediction models based on network modular structures to characterize
and forecast the interdependent responses of network components in evolving networks with limited
data. Including the dynamics and evolution of a network in the analysis of inherent modules is a
new task that has not yet been well investigated. The study of such evolution requires computing
the modules of the network at different time instances. However, identifying network modules in
each state of the network from scratch may result in prohibitive computational costs, particularly
in the case of highly dynamic networks. In addition, it may be infeasible in the case of limited
topological data. In this regard, the study requires us to solve many interesting problems, such
as: (1) how to devise new measures and methods for computing network modules that are robust to
changes, (2) how to update the modules easily once changes occur without recalculating them from
scratch, and (3) how sensitive the community structure is with respect to the failures of nodes and
edges.
Relevance. This study helps us understand the interdependencies of network components and provides
a robust solution for many applications that are sensitive to the structure of network communities.
It also addresses improving network robustness at minimum additional cost so that the network
modules are less sensitive to changes, thus providing a cost-effective protection scheme.

Knowledge of this crucial vulnerability of network community structure not only helps us
understand network interdependencies but is also of considerable practical use, with many
applications in social-aware methods for mobile ad-hoc networks and online social networks (OSNs).
For instance, since social-based forwarding and routing strategies in Delay Tolerant Networks rely
heavily on the highest-ranking node in each community to forward messages, awareness of this
vulnerability can help design either a routing algorithm that does not overload those crucial nodes
or an effective backup plan for when some of them fail at the same time.

Summary  of  Findings. 

•  Provided the first constant-factor approximation algorithm with a performance guarantee for
finding community structure in power-law networks and trees. This is an important first step toward
predicting changes in community structure.

•  Provided the first adaptive approximation algorithms for both disjoint and overlapping community
structure. The adaptive solutions let us update the community structure in a very short amount of
time as changes occur.

•  Assessed the vulnerability of network structure under attacks on both edges and nodes, and showed
that community structure is not as strong as commonly thought.

2.2.1  Detailed  Results  for  Identifying  Community  Structures 

Consider  a  network  represented  as  an  undirected  graph  G  =  (V.  E )  consisting  of  n  —  \  V\  vertices 
and  m  =  \E\  edges.  The  adjacency  matrix  of  G  is  denoted  by  A  =  {A,3 ) ,  where  AVJ  is  the  weight 

of  edge  (i,j)  and  AtJ  =  0  if  (i,j)  (j  E.  We  also  denote  the  (weighted)  degree  of  vertex  i,  the  total 

weights  of  edges  incident  at  i,  by  deg(i)  or,  in  short,  dj. 

Community structure (CS) is a division of the vertices in $V$ into a collection of disjoint subsets of vertices $\mathcal{C} = \{C_1, C_2, \ldots, C_l\}$ (with unspecified $l$) where $\bigcup_{i=1}^{l} C_i = V$. Each subset $C_i \subseteq V$ is called a community, and we wish to have more edges connecting vertices in the same community than edges connecting vertices in different communities. The modularity of $\mathcal{C}$ is the fraction of the edges that fall within the given communities minus the expected value of that fraction if edges were distributed at random. The randomization of the edges is done so as to preserve the degree of each vertex. If vertices $i$ and $j$ have degrees $d_i$ and $d_j$, then the expected number of edges between $i$ and $j$ is $d_i d_j / (2M)$. Thus, the modularity, denoted by $Q$, is

$$Q(\mathcal{C}) = \frac{1}{2M} \sum_{i,j} \left( A_{ij} - \frac{d_i d_j}{2M} \right) \delta_{ij} \qquad (14)$$

where $M$ is the total edge weight and the element $\delta_{ij}$ of the membership matrix $\delta$ is defined as

$$\delta_{ij} = \begin{cases} 1, & \text{if } i \text{ and } j \text{ are in the same community} \\ 0, & \text{otherwise.} \end{cases}$$

Modularity values can be either positive or negative, and higher (positive) values indicate stronger community structure. The modularity maximization problem therefore asks for a division $\mathcal{C}$ that maximizes the modularity value $Q(\mathcal{C})$.
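As a minimal sketch of how the modularity in (14) can be evaluated, the function below groups the double sum by community, using the fact that the sum of $A_{ij}$ over pairs inside a community equals twice its internal edge weight when there are no self-loops. The function name, the edge-list input format, and the example graph are illustrative assumptions, not part of the original work.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Q(C) = (1/2M) * sum_ij (A_ij - d_i d_j / 2M) * delta_ij.

    edges: iterable of (i, j, weight), each undirected edge listed once,
           no self-loops (assumption of this sketch).
    community_of: dict mapping vertex -> community label.
    """
    total_weight = 0.0               # M, the total edge weight
    degree = defaultdict(float)      # d_i, weighted degree
    internal = defaultdict(float)    # edge weight inside each community
    for i, j, w in edges:
        total_weight += w
        degree[i] += w
        degree[j] += w
        if community_of[i] == community_of[j]:
            internal[community_of[i]] += w
    volume = defaultdict(float)      # sum of degrees per community
    for v, d in degree.items():
        volume[community_of[v]] += d
    two_m = 2.0 * total_weight
    # Per community: (2 * internal weight) / 2M - (volume / 2M)^2
    return sum(2.0 * internal[c] / two_m - (volume[c] / two_m) ** 2
               for c in volume)

# Two triangles joined by a single edge (hypothetical example).
example_edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
                 (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(example_edges, split)
```

Placing every vertex in one community gives $Q = 0$ under this formula, which is why only divisions with positive modularity are of interest.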

This problem differs from the partition problem in that the total number of parts is not known beforehand; that is, $l$ is unspecified. Somewhat surprisingly, modularity maximization remains NP-complete even on trees, one of the simplest graph classes.


Theorem  5  Modularity  maximization  on  trees  is  NP-complete. 


The  proof  has  been  presented  in  [23],  reducing  from  the  Subset-Sum  problem. 


Exact Solutions. Although the problem is NP-complete, efficient algorithms that obtain optimal solutions for small networks are still of interest. We have presented an exact algorithm with a run time of $O(n^5)$ for the problem on uniform-weighted trees [23]. The algorithm is based on dynamic programming and exploits the relationship between maximizing modularity and minimizing the sum of squares of component volumes, where the volume of a component $S$ is defined as $\mathrm{vol}(S) = \sum_{v \in S} d_v$.

When the input graph is not a tree, we provided an exact solution based on Integer Linear Programming (ILP) [23]. Note that the ILP for modularity maximization contains the triangle inequality $x_{ij} + x_{jk} - x_{ik} \ge 0$ to guarantee that the values of $x_{ij}$ are consistent with each other. Here $x_{ij} = 0$ if $i$ and $j$ are in the same community; otherwise $x_{ij} = 1$. Therefore, the ILP has $3\binom{n}{3} = O(n^3)$ constraints, which is about half a million constraints for a network of 100 vertices. As a consequence, the sizes of solved instances were limited to a few hundred nodes. Along this direction, we have presented a sparse metric, which reduces the number of constraints to $O(n^2)$ in sparse networks where $m = O(n)$.
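The constraint count quoted above can be checked with a short calculation: each unordered triple of vertices contributes three triangle inequalities. The function name below is a hypothetical helper used only for this illustration.

```python
from math import comb

def triangle_constraint_count(n):
    """Number of triangle inequalities in the dense ILP: three per vertex triple."""
    return 3 * comb(n, 3)

# For a 100-vertex network this is roughly half a million constraints.
print(triangle_constraint_count(100))  # 485100
```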

Approximation Algorithms. When $G$ is a tree, the problem can be solved by a polynomial time approximation scheme (PTAS) with a run time of $O(n^{\epsilon+1})$ for $\epsilon > 0$ [23]. The PTAS is based solely on the following observation. Removing $k - 1$ edges in $G$ yields $k$ connected communities, and $Q_k \ge (1 - \frac{1}{k}) Q_{opt}$, where $Q_k$ is the maximum modularity of a community structure with $k$ communities and $Q_{opt}$ is the modularity of the optimal solution.

When the degree distribution of $G$ follows a power law, i.e., the fraction of nodes in the network having degree $k$ is proportional to $k^{-\gamma}$, where $1 < \gamma < 4$, the problem can be approximated to a constant factor for $\gamma > 2$ and up to an $O(1/\log n)$ factor when $1 < \gamma < 2$ [19]. The details of this algorithm, namely Low-Degree Following (LDF), are presented below.

Algorithm. Low-Degree Following Algorithm (Parameter d_0 ∈ N+)

1.  L := ∅, M := ∅, O := ∅, p_i = 0 ∀ i = 1..n
2.  for each vertex i ∈ V do
3.      if (k_i ≤ d_0) & (i ∉ L ∪ M) then
4.          if N(i) \ M ≠ ∅ then
5.              Select a vertex j ∈ N(i) \ M
6.              Let M = M ∪ {i}, L = L ∪ {j}, p_i = j
7.          else
8.              Select a vertex t ∈ N(i)
9.              O = O ∪ {i}, p_i = t
10. 𝓛 = ∅
11. for each vertex i ∈ V \ (M ∪ O) do
12.     C_i = {i} ∪ {j ∈ M | p_j = i} ∪ {t ∈ O | p_{p_t} = i}
13.     𝓛 = 𝓛 ∪ {C_i}
14. Return 𝓛

The selection of $d_0$ is important for deriving the approximation factor, as $d_0$ needs to be a sufficiently large constant that is still small relative to $n$ when $n$ tends to infinity. In an actual implementation of the algorithm, we have designed an automatic selection of $d_0$ to maximize $Q$. LDF can be extended to solve the problem in directed graphs [19].
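The LDF steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in [19]: the graph is given as a dictionary of neighbour sets, "select a vertex" is resolved deterministically with `min` (the listing leaves the choice open), a low-degree vertex is taken to mean degree at most $d_0$, and isolated vertices simply form their own community. The names `leaders`, `members`, `orbiters`, and `follows` stand for $L$, $M$, $O$, and $p$.

```python
def low_degree_following(adj, d0):
    """One pass of Low-Degree Following.

    adj: dict vertex -> set of neighbours (undirected, no self-loops).
    d0:  degree threshold; vertices of degree <= d0 try to follow a neighbour.
    Returns a list of communities (sets of vertices).
    """
    leaders, members, orbiters = set(), set(), set()   # L, M, O
    follows = {}                                        # p_i
    for i in adj:
        if len(adj[i]) <= d0 and i not in leaders and i not in members:
            candidates = adj[i] - members               # N(i) \ M
            if candidates:
                j = min(candidates)                     # assumption: deterministic pick
                members.add(i)
                leaders.add(j)
                follows[i] = j
            elif adj[i]:
                t = min(adj[i])                         # all neighbours are members
                orbiters.add(i)
                follows[i] = t
    # Every vertex outside M and O is the root of one community.
    communities = {i: {i} for i in adj if i not in members and i not in orbiters}
    for j in members:
        communities[follows[j]].add(j)                  # member joins its leader
    for t in orbiters:
        communities[follows[follows[t]]].add(t)         # orbiter joins its member's leader
    return list(communities.values())

# Path 0-1-2-3 with d0 = 2 (hypothetical example).
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
result = low_degree_following(path, 2)
```

On this path, vertices 0 and 2 follow vertex 1, and vertex 3 becomes an orbiter of member 2, so a single community is returned. Choosing $d_0$ by trying several values and keeping the one with the highest modularity is one simple way to realise the automatic selection mentioned above.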

Furthermore, in some cases communities share nodes with one another; these are referred to as overlapping communities. That is, a person or a node can belong to more than one community. Therefore, we further designed an algorithm to find overlapping network modules that requires only one parameter, indicating the level of overlap. In our simulations, it outperformed the existing methods in the literature. This work was published in the IEEE Conference on Social Computing, 2011.

2.2.2  Detailed  Results  for  Adaptively  Updating  Community  Structures 

We continued studying the adaptive identification of community structures, focusing on the following question: how can an evolving community structure be updated without recomputing it? In this approach, the community structure (CS) at time $t$ is detected based on the community structure at time $t - 1$ and the changes in the network, instead of being recomputed from scratch at time $t$ without taking advantage of the solution at time $t - 1$. Along this direction, we have devised an adaptive approximation algorithm for this problem, published in [11]. Indeed, the above LDF algorithm can be enhanced to cope with this situation. First, LDF is run to find the base CS at time 0. Then, at each time step, we adaptively follow and unfollow the nodes that violate the condition in step 3 of the LDF algorithm.

We further investigated the overlapping community structure and provided the first adaptive algorithm for updating overlapping network modules. This work is published in [8, 56].

2.2.3  Detailed  Results  for  Assessing  Network  Structure  Vulnerability 

Impact of Nodes' Failures on Network Components. In this task, we are interested in identifying the set of nodes whose removal triggers a significant restructuring of the current community structure. In terms of notation, given the input network, a community detection algorithm $\mathcal{A}$, and a positive number $k$, we formulated the Community structure Vulnerability Assessment (CVA) problem, which aims to find a set $S$ of $k$ nodes whose removal maximally transforms the current network community structure into a different one, as evaluated by the Normalized Mutual Information (NMI) measure.

Definition 9 Given a network represented by an undirected and unweighted graph $G$, a specific community detection algorithm $\mathcal{A}$, and a positive integer $k < n$, we seek a subset $S \subseteq V$ such that

$$S = \operatorname*{argmin}_{S' \subseteq V,\ |S'| = k} \{ NMI_X(S') \},$$

where $X = \mathcal{A}(G)$, and $NMI_X(S') = NMI(X, \mathcal{A}(G[V \setminus S']))$ for any $S' \subseteq V$.

Our major findings for this task are: (1) We analyzed conditions that can lead to the minimization of NMI on community structures. (2) We devised an approximation algorithm for the case $k = 1$, and suggested multiple heuristic algorithms for the CVA problem. We validated the effectiveness of our solutions on both synthesized data with known community structures and real-world traces including the Arxiv citation network and the Facebook and Foursquare social networks. The details can be found in [6].
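To make the objective concrete, the following sketch computes an NMI score between two community assignments. It assumes the common normalization $2I(X;Y)/(H(X)+H(Y))$ and compares only vertices present in both assignments (removed nodes are ignored); the source does not state which NMI variant was used, so both choices are assumptions of this example.

```python
from collections import Counter
from math import log

def nmi(labels_x, labels_y):
    """Normalized mutual information of two labelings (dict vertex -> label).

    Assumption: NMI = 2 * I(X;Y) / (H(X) + H(Y)), evaluated on the vertices
    that appear in both labelings.
    """
    common = [v for v in labels_x if v in labels_y]
    n = len(common)
    cx = Counter(labels_x[v] for v in common)
    cy = Counter(labels_y[v] for v in common)
    joint = Counter((labels_x[v], labels_y[v]) for v in common)
    hx = -sum(c / n * log(c / n) for c in cx.values())     # entropy H(X)
    hy = -sum(c / n * log(c / n) for c in cy.values())     # entropy H(Y)
    mi = sum(c / n * log(c * n / (cx[a] * cy[b]))          # mutual information
             for (a, b), c in joint.items())
    if hx + hy == 0:
        return 1.0          # both labelings consist of a single community
    return 2 * mi / (hx + hy)

before = {0: "a", 1: "a", 2: "b", 3: "b"}
unrelated = {0: "c", 1: "d", 2: "c", 3: "d"}
same_score = nmi(before, before)
unrelated_score = nmi(before, unrelated)
```

A CVA solver would call such a function on the communities found before and after removing a candidate set $S'$ and keep the set with the lowest score.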

We have provided the basic results for the NMI analysis as follows:

Lemma 7 There is a graph $G = (V, E)$ in which $NMI_X(\cdot)$ is not a submodular function. Moreover, there are subsets $L \subseteq T \subseteq V$ such that $NMI_X(T) > NMI_X(L)$ (where $L$, $T$ are sets of removed nodes).

Theorem 6 Given two community assignments $A \subseteq B$, there is $s \notin A, B$ such that $NMI_X(A + s) - NMI_X(A) < NMI_X(B + s) - NMI_X(B)$.

We provided three algorithms to find a subset $S$. Our first heuristic approach is based on the modularity contributions of the network communities in $G$. It has two versions, called greedyMN and greedyMC, which give different priorities to nodes and communities. In greedyMN, all nodes $u$ in the network are ranked by their modularity contributions $q_{u,C}$, and the top $k$ nodes are selected into the solution set. The second algorithm, greedyMC, consists of two steps: it first finds the community $C$ having the largest modularity contribution $q_C$, and then selects a node $u$ that has the highest modularity portion $q_{u,C}$ in $C$, repeating until $k$ nodes are included in the solution set.

Our second heuristic approach for the CVA problem, greedyC, is based on components. Given a community structure $X$ and the algorithm $\mathcal{A}$, greedyC tries to find nodes which can potentially break current communities into smaller ones of roughly the same size, with preference given to large communities. In particular, greedyC looks into the communities $X_i$ of $X$, ordered by their sizes, and selects nodes that can divide each community into more subcomponents.

Impact of Edges' Failures on Network Components. In this task, we are interested in identifying the set of edges whose removal triggers a significant restructuring of the current community structure, defined as follows:

Definition 10 (DBC) Given an undirected graph $G = (V, E)$ and a set $\mathcal{C}$ of $k$ communities, find a subset $S \subseteq E$ of minimum cardinality such that removing $S$ from the graph breaks every community in $\mathcal{C}$.

Our  major  findings  of  this  task  are: 

• We defined the framework for community structure fragility. We first introduced the density-based broken community (DBC) problem of breaking $k$ communities with the minimum number of edge removals, and provided an approximation algorithm, namely CVA, with a theoretical performance guarantee of $O(\log k)$. Its pseudo-code is shown in Algorithm 8.

• To analyze the vulnerability of community structures in a broader sense, we extended the problem formulation to communities produced by an arbitrary community detection algorithm. We offered an efficient heuristic to break the communities and identify the set of critical edges.

•  We  conducted  extensive  experiments  with  different  parameters  to  mine  interesting  observations 
about  the  behavior  of  broken  communities  after  edge  removal. 

The  details  can  be  found  in  [2]. 
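The greedy selection behind the CVA algorithm for DBC can be sketched as follows. The source does not define the gain function or the deletion vector in this section, so this example makes explicit assumptions: each community is given as a set of edges, $D_i$ is the number of edge removals needed to break community $i$, the gain of an edge is the number of still-unbroken communities containing it, a community counts as broken once $D_i$ reaches zero, already chosen edges are not reconsidered, and ties go to the first maximum instead of a random choice.

```python
def greedy_critical_edges(edges, communities, deletions_needed):
    """Greedy edge selection in the spirit of Algorithm 8 (illustrative only).

    edges:            list of hashable edge identifiers.
    communities:      list of sets of edges (each a subset of `edges`).
    deletions_needed: list D, D[i] = removals assumed necessary to break community i.
    """
    remaining = list(deletions_needed)
    broken = set()                                       # indices of broken communities
    gain = {e: sum(e in c for c in communities) for e in edges}
    candidates = set(edges)
    chosen = []
    while len(broken) < len(communities):
        useful = [e for e in edges if e in candidates and gain[e] > 0]
        if not useful:
            break                                        # nothing left that helps
        best = max(useful, key=gain.__getitem__)         # first maximum on ties
        chosen.append(best)
        candidates.discard(best)
        for i, community in enumerate(communities):
            if i not in broken and best in community:
                remaining[i] -= 1
                if remaining[i] <= 0:
                    broken.add(i)
                    for e in community:                  # broken communities stop counting
                        gain[e] -= 1
    return chosen

picked = greedy_critical_edges(
    ["a", "b", "c", "d"],
    [{"a", "b"}, {"b", "c"}, {"d"}],
    [1, 1, 1],
)
```

In the example, edge `b` lies in two communities and is picked first, after which only `d` still contributes to an unbroken community.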

For a general definition of community structure, we extended the DBC problem to the following one:

Definition 11 (Broken Community) Consider a community detection algorithm $\mathcal{A}$, which produces a collection $\mathcal{C}$ of communities on graph $G$ (written $\mathcal{C} = \mathcal{A}(G)$). Let $G'$ be a new graph after removal of a set of edges, and let $\mathcal{C}' = \mathcal{A}(G')$. Let $\gamma \in (0, 1)$. A community $C \in \mathcal{C}$ is said to be broken in graph $G'$ if there does not exist a community $C' \in \mathcal{C}'$ satisfying
(i) $C' \subseteq C$, and (ii) $|C'| / |C| \ge \gamma$.

We have shown that partitioning a community $C$ into at least $c$ $\epsilon$-balanced subparts, where $\gamma c > 1 + \epsilon$, makes it broken. Therefore, we developed Algorithm 9 (CCF).
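The two ingredients used by the CCF heuristic, the number of parts $c$ and the broken test of Definition 11, can be illustrated with a small sketch. It assumes that an $\epsilon$-balanced partition into $c$ parts means every part has at most $(1 + \epsilon)|C|/c$ vertices, so that $\gamma c > 1 + \epsilon$ keeps every part below the fraction $\gamma$ of $C$; the function names are hypothetical.

```python
from math import floor

def least_parts(gamma, eps):
    """Least integer c with c * gamma > 1 + eps."""
    return floor((1 + eps) / gamma) + 1

def is_broken(community, new_communities, gamma):
    """Definition 11: C is broken iff no new community C' has
    C' subset of C and |C'| / |C| >= gamma."""
    return not any(c <= community and len(c) / len(community) >= gamma
                   for c in new_communities)

c_parts = least_parts(0.5, 0.0)                       # 3, since 2 * 0.5 is not > 1
halves = is_broken({1, 2, 3, 4}, [{1, 2}, {3, 4}], 0.6)
mostly_intact = is_broken({1, 2, 3, 4}, [{1, 2, 3}, {4}], 0.6)
```

Here `halves` is true because neither half reaches 60% of the original community, while `mostly_intact` is false because the part `{1, 2, 3}` keeps 75% of it.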

2.3 Impact Analysis of the Power-Law Degree Distribution on Network Vulnerability

Many practical complex networks, such as the Internet and the WWW, have been found to follow a power-law distribution in their degree sequences, i.e., the number of nodes with degree $i$ in these networks is proportional to $i^{-\beta}$ for some exponential factor $\beta > 0$. Although power-law networks have been found, through experimental observations, to be robust under random attacks and vulnerable to intentional attacks, a better understanding of their vulnerabilities from a theoretical point of view remains open.

Furthermore, it is a common belief that solving optimization problems in power-law graphs is easier than in general graphs. From an algorithmic perspective, some experiments had been developed to evaluate simple algorithms for optimization problems in power-law graphs. However, there is no work that provides an algorithmic framework for solving a set of problems in power-law graphs using the degree distribution property, let alone a theoretical framework for analyzing approximation ratios.

Algorithm 8: CVA: An approximation algorithm for finding the critical edges

Data: Network G = (V, E), DeletionVector D, |𝒞| = k
Result: A set S ⊆ E of edges

1  S ← ∅;
2  C ← ∅;
3  for each edge e ∈ E do
4      compute the gain f(e);
5  end
6  while |C| < k do
7      e' ← argmax_{e ∈ E} {f(e)};
8      In case of a tie, choose randomly;
9      S ← S ∪ {e'};
10     for i = 1 to k do
11         if C_i ∉ C then
12             if e' ∈ C_i then
13                 D_i ← D_i − 1;
14                 if D_i < 0 then
15                     C ← C ∪ {C_i};
16                     f(e) = f(e) − 1 for all e ∈ C_i;
17                 end
18             end
19         end
20     end
21 end
22 return S;

Algorithm 9: CCF: A heuristic algorithm for breaking communities

Data: Network G = (V, E), k communities 𝒞, strictness threshold γ
Result: A set S ⊆ E of edges

1  S ← ∅;
2  c ← ℓ : ℓ is the least integer satisfying ℓγ > 1 + ε;
3  for each community C_i ∈ 𝒞 do
4      compute the c-way balanced partitioning;
5      Cut_i = set of edges to cut C_i into c parts;
6      S ← S ∪ Cut_i;
7  end
8  return S;

Therefore, we focused on addressing the following issues: (i) analyzing the vulnerability of power-law networks under various attacks, (ii) studying the complexity of many optimization problems, and (iii) developing approximation algorithms on power-law graphs.

Relevance. This study is important for understanding and providing solutions to network vulnerability in practice, as many real-life networks follow the power-law degree distribution. The results of this task advance the research front of approximation theory and optimization. They help us develop solutions for many problems on power-law networks (PLNs), and thus the findings of this task are helpful for many applications.

Summary  of  Findings. 

• Showed that power-law networks almost surely are not vulnerable to attacks, including random and preferential attacks, when $\beta$ is small enough.

• Developed new embedding techniques to reinvestigate most of the classic optimization problems on power-law networks, such as dominating set, maximum independent set, vertex cover, and clique. We have shown that these problems remain NP-complete on power-law networks but may have better (tighter) approximation ratios.

• Developed an approximation algorithm framework, called the Low-Degree Percolation (LDP) Algorithm Framework, for solving the Minimum Dominating Set (MDS), Minimum Vertex Cover (MVC), and Maximum Independent Set (MIS) problems in power-law networks.

2.3.1  Detailed  Results  for  Vulnerability  Analysis 

In this task, we studied the vulnerability of power-law networks under random attacks and adversarial attacks using in-depth probabilistic analysis on the theory of random power-law graph models. Our results indicate that power-law networks are able to tolerate random failures if their exponential factor $\beta$ is less than 2.9, and that they are more robust against intentional attacks if $\beta$ is smaller. Furthermore, we revealed the best range $[1.8, 2.5]$ for the exponential factor $\beta$ by optimizing the complex networks in terms of both their vulnerabilities and costs. When $\beta < 1.8$, the network maintenance cost is very expensive, and when $\beta > 2.5$ the network robustness is unpredictable since it depends on the specific attacking strategy.

The detailed proofs can be found in [37, 38]. Here, we list our major findings as theorems.

Theorem 7 In a residual graph $G_r$ of $G_{(\alpha,\beta)}$ after random failures,

• If $\beta < \beta_p$, the expected pairwise connectivity $E(P)$ is a.s. $\Theta(n^2)$;

• If $\beta > \beta_p$, the pairwise connectivity $P$ is a.s. at most $\frac{1}{2} n \left( c_r(\beta)\, n^{2/\beta} \log n - 1 \right)$,

where $\beta_p$ satisfies $(1 - p)\,\zeta(\beta_p - 2) - (2 - p)\,\zeta(\beta_p - 1) = 0$ and

$$c_r(\beta) = \frac{16}{\zeta(\beta) \left( 2 - p - (1 - p) \frac{\zeta(\beta - 2)}{\zeta(\beta - 1)} \right)^{2}}.$$

Theorem 8 In a residual graph $G_p$ of $G_{(\alpha,\beta)}$ after interactive preferential attacks,

• If $\beta + \beta' < 3.47875$, the expected pairwise connectivity $E(P)$ is $\Theta(n^2)$;

• If $\beta + \beta' > 3.47875$, the pairwise connectivity $P$ is a.s. at most $\frac{1}{2} n \left( c(\beta)\, n^{2/\beta} \log n - 1 \right)$,

where $c(\beta) = 16 / [\zeta(\beta)\,(2 - \ldots)]$ is a constant for any given $\beta$.


Theorem 9 In a residual graph $G_p$ of $G_{(\alpha,\beta)}$ after expected preferential attacks,

• The pairwise connectivity $P$ is a.s. $\Theta(n^2)$ if $c < \min\{ c \mid \ldots > 1 \}$;

• The pairwise connectivity $P$ is a.s. at most $\frac{1}{2} n^{\ldots} \log n$ if $c > \max\{ c \mid \ldots < 1 \}$.


Theorem 10 In a residual graph $G_c$ of $G_{(\alpha,\beta)}$ after degree-centrality attacks,

• The pairwise connectivity $P$ is a.s. $\Theta(n^2)$ if $x_0 > \min\{ x_0 \mid \ldots > 1 \}$;

• The pairwise connectivity $P$ is a.s. at most $\frac{1}{2} n^{\ldots} \log n$ if $x_0 < \max\{ x_0 \mid \ldots < 1 \}$.


Lemma 8 Let $G_c^p$ be the residual graph of $G_{(\alpha,\beta)}$ consisting only of the protected degree-centrality nodes (the nodes of degree larger than $x_0$). Then

• The pairwise connectivity $P$ is a.s. $\Theta(n^2)$ if $x_0 < \max\{ x_0 \mid \ldots > 1 \}$;

• The pairwise connectivity $P$ is a.s. at most $\frac{1}{2} n^{\ldots} \log n$ if $x_0 > \min\{ x_0 \mid \ldots < 1 \}$.


Theorem 11 In the residual graph $G_s$ of $G_{(\alpha,\beta)}$, the expected size of a connected component $c$ is a.s. upper bounded by $O(n^{\ldots})$ when $d_s < 1$, that is, when

$$x_0 < \max\left\{ x_0 \;\middle|\; \frac{1}{\zeta(\beta - 1)} \sum_{x=1}^{x_0} \frac{1}{x^{\beta - 2}} < 1 \right\}.$$

2.3.2 Detailed Results for Hardness Complexity


In this task, we aimed to develop new embedding techniques for many classical problems in order to investigate whether they remain NP-complete even on power-law graphs.

Embedding techniques allow an original graph $G$ to be embedded into a power-law graph $G_p$ such that a considered problem can be solved in polynomial time on $G_p \setminus G$, thus carrying the complexity of the problem on $G$ over to $G_p$. We developed two new techniques for optimal substructure problems, the Cycle-Based Embedding Technique and the Graphic Embedding Technique, to embed a $d$-bounded graph into a general power-law graph and a simple power-law graph, respectively. We then used these two techniques to prove the NP-hardness and the inapproximability of many classical problems, such as Minimum Dominating Set (MDS), Maximum Independent Set (MIS), Minimum Vertex Cover (MVC), Clique, Coloring, and p-Minimum Dominating Set (p-MDS), on general power-law graphs and simple power-law graphs. These inapproximability results on power-law graphs are shown in Table 1.

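Table 1: Inapproximability Factors (a) on Power-Law Graphs with Exponential Factor $\beta > 1$

Problem       General Power-Law Graph                                                                                      Simple Power-Law Graph
MIS           $1 + \frac{1}{140(2\zeta(\beta)3^{\beta} - 1)} - \epsilon$                                                   $1 + \frac{1}{1120\,\zeta(\beta)3^{\beta}} - \epsilon$
MDS           $1 + \frac{1}{390(2\zeta(\beta)3^{\beta} - 1)}$                                                              $1 + \frac{1}{3120\,\zeta(\beta)3^{\beta}}$
MVC, p-MDS    $1 + \frac{2\left(1 - (2 + o_c(1))\frac{\log\log c}{\log c}\right)}{(\zeta(\beta)c^{\beta} + c^{1/\beta})(c + 1)}$    $1 + \frac{2 - (2 + o_c(1))\frac{\log\log c}{\log c}}{2\zeta(\beta)c^{\beta}(c + 1)}$
Clique        -                                                                                                            $O(n^{1/(\beta+2) - \epsilon})$
Coloring      -                                                                                                            $O(n^{1/(\beta+1) - \epsilon})$

(a) Conditions: MIS and MDS: P ≠ NP; MVC, p-MDS: unique games conjecture; Clique, Coloring: NP ≠ ZPP.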


The inapproximability factors suggest that these problems may be easier to approximate on PLNs than on general graphs. The details of the cycle-based embedding technique and the graphic embedding technique can be found in [43, 45]. Here, we list the major theorems:

Theorem 12 (Cycle-Based Embedding Technique) Any $d$-bounded graph $G_d$ can be embedded into a power-law graph $G_{(\alpha,\beta)}$ with $\beta > 1$ such that $G_d$ is a maximal component and most optimal substructure problems are polynomially solvable on $G_{(\alpha,\beta)} \setminus G_d$.

Lemma 9 Given a sequence of integers $D = (d_1, d_2, \ldots, d_n)$ which is non-increasing and continuous, and in which the number of elements is at least twice the largest element in $D$, i.e., $n \ge 2 d_1$, it is possible to construct a simple graph $G$ whose $d$-degree sequence is $D$ in polynomial time $O(n^2 \log n)$.
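To give a feel for how a simple graph can be built from a degree sequence, the sketch below uses the classical Havel-Hakimi procedure. This is a standard textbook technique substituted here for illustration, not necessarily the construction used in the proof of Lemma 9; it also returns `None` when the sequence is not realisable (for example, when the degree sum is odd), a case the lemma statement does not discuss.

```python
def havel_hakimi(degrees):
    """Return an edge list realising `degrees` as a simple graph, or None.

    Repeatedly connects the vertex of largest remaining degree d to the
    next d vertices of largest remaining degree.
    """
    remaining = [[d, v] for v, d in enumerate(degrees)]   # [residual degree, vertex]
    edges = []
    while True:
        remaining.sort(reverse=True)
        d, v = remaining[0]
        if d == 0:
            return edges                  # every degree has been satisfied
        if d > len(remaining) - 1:
            return None                   # not enough other vertices
        remaining[0][0] = 0
        for other in remaining[1:d + 1]:
            if other[0] == 0:
                return None               # would need a vertex with no degree left
            other[0] -= 1
            edges.append((v, other[1]))

sequence = [3, 3, 2, 2, 1, 1]             # non-increasing, n = 6 >= 2 * 3
graph_edges = havel_hakimi(sequence)
```

Sorting once per round over at most $n$ rounds gives a running time of $O(n^2 \log n)$ for this simple version.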

Theorem 13 (Graphic Embedding Technique) Any $d$-bounded graph $G_d$ can be embedded into a simple power-law graph $G_{(\alpha,\beta)}$ with $\beta > 1$ in polynomial time such that $G_d$ is a maximal component and the number of vertices in $G_{(\alpha,\beta)}$ can be polynomially bounded by the number of vertices in $G_d$.

2.3.3 Detailed Results for Approximation Algorithms

Motivated by the above results, we went on to develop a general approximation framework, called Low-Degree Percolation (LDP), to approximate optimization problems in power-law networks, including the MIS, MDS, and MVC problems. The idea of the LDP framework is to percolate the graph starting from the large number of low-degree nodes in a power-law graph. This allows us to develop a theoretical framework for analyzing the approximation ratios via probabilistic analysis.

In  particular,  we  applied  this  theoretical  framework  to  show  the  approximation  ratios  for  these 
problems  on  two  well-known  random  power-law  models  as  follows: 


Theorem 14 (Main Theorem (MDS&MVC)) In a power-law graph $G_{(\alpha,\beta)}$, by using the LDP Algorithm, MDS and MVC can be approximated within

$$1 + (\Gamma - 1)\lambda$$

with probability at least $1 - p_\lambda$, where $\Gamma$ is the approximation ratio of MDS (or MVC) in general graphs w.r.t. a graph of size at most $e^{\alpha} \ldots$


Theorem 15 (Main Theorem (MIS)) In a power-law graph $G_{(\alpha,\beta)}$, by using the LDP Algorithm, MIS can be approximated within

$$N + \ldots$$

with probability at least $1 - p_\lambda$, where $N$ is the number of nodes with degree 1 and $\Gamma$ is the approximation ratio of MIS in general graphs w.r.t. a graph of size at most $e^{\alpha} \ldots$


Theorem 16 In a Structured Random Power-Law (SRPL) Graph $G$, by using the LDP Algorithm, MDS and MVC can be approximated within

$$1 + (\Gamma - 1)\lambda$$

with probability at least

$$\sum_{\tau = 0}^{\lfloor e^{\alpha}/2 \rfloor} \Pr[C_2 = \tau]\,(1 - p_\lambda^{\tau}),$$

where $p_\lambda^{\tau} = \ldots$, in which $\lambda_\tau = \lambda + \ldots$


The major findings of this task are: (i) We developed an approximation algorithm framework, called the Low-Degree Percolation (LDP) Algorithm Framework, for solving the Minimum Dominating Set (MDS), Minimum Vertex Cover (MVC), and Maximum Independent Set (MIS) problems in power-law graphs. (ii) Using this framework, we further provided a theoretical framework to derive the approximation ratios for these optimization problems in two well-known random power-law graph models. (iii) Our numerical analysis showed that, using our proposed LDP algorithms, these optimization problems can be approximated within a factor close to 1 with high probability in power-law graphs with exponential factor $\beta > 1.5$, which covers the range of most real-world networks. The details of these findings can be found in [17].

In detail, the approximation algorithm framework is based on the degree sequence property of power-law graphs. The most fundamental property of power-law graphs is that they contain a large number of low-degree nodes and only a small number of high-degree nodes. Therefore, the idea of our proposed Low-Degree Percolation (LDP) algorithm framework is to sort the nodes by their degrees and percolate the graph starting from the nodes of lowest degree. The process continues iteratively in the residual graph until no more nodes that are surely in an optimal solution can be detected. Finally, we apply existing approximation approaches to the remaining graph.

For the MDS and MVC problems, as shown in Algorithm 10, since the node adjacent to a node of degree 1 certainly belongs to some optimal solution, we percolated the graph by adding all the neighbors of nodes with degree 1 in each iteration. Once no more nodes of degree 1 exist in the residual graph, we applied an existing approximation algorithm to obtain the solution in this residual graph.


Algorithm 10: LDP Algorithm for MDS/MVC Problems

Input : Power-law graph G
Output: MDS (or MVC) S

1  while ∃ nodes of degree 1 do
2      foreach node v of degree 1 do
3          Add its neighbor N(v) into S;
4          Remove v from G;
5      end
6      Remove all nodes incident to S from graph G;
7  end
8  Determine the leftover MDS (or MVC) in G using an existing approximation algorithm and add it into S;
9  return S;
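A minimal Python sketch of the percolation idea for the vertex cover case is given below. It is an interpretation under stated assumptions, not the authors' implementation: it handles MVC only, it reads the removal step as deleting each selected node together with its incident (now covered) edges, and it uses the classical maximal-matching 2-approximation as the "existing approximation algorithm" for the leftover graph.

```python
def ldp_vertex_cover(adj):
    """Low-degree percolation followed by a 2-approximation (MVC only).

    adj: dict vertex -> set of neighbours (undirected, no self-loops).
    Returns a set of vertices covering every edge.
    """
    graph = {v: set(nbrs) for v, nbrs in adj.items()}    # working copy
    cover = set()

    def remove(u):
        for w in graph.pop(u):
            graph[w].discard(u)

    # Percolation: the neighbour of a degree-1 node goes into the cover.
    leaves = [v for v in graph if len(graph[v]) == 1]
    while leaves:
        v = leaves.pop()
        if v not in graph or len(graph[v]) != 1:
            continue                                      # stale entry
        (u,) = graph[v]
        cover.add(u)
        touched = graph[u] - {v}
        remove(u)                                         # u's edges are now covered
        leaves.extend(w for w in touched if len(graph[w]) == 1)

    # Leftover graph: take both endpoints of every uncovered edge.
    for v in list(graph):
        for w in list(graph[v]):
            if v not in cover and w not in cover:
                cover.update((v, w))
    return cover

path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
path_cover = ldp_vertex_cover(path)
```

On the five-node path the percolation phase alone finds the optimal cover `{1, 3}`; on graphs without degree-1 nodes, such as a cycle, the result comes entirely from the fallback step.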


On the other hand, Algorithm 11 shows the algorithm for MIS. In this case, the nodes of degree 1 belong to an optimal solution, and their neighbors therefore cannot be in that optimal solution. Hence, in order to obtain an MIS, we selected all nodes of degree 1 into the solution in each iteration and then ran an existing approximation algorithm to obtain the MIS in the remaining graph.
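The MIS variant can be sketched in the same style. As assumptions of this example, degree-1 nodes are processed one at a time (so that two adjacent degree-1 nodes are never both selected), and the leftover graph is handled with a simple minimum-degree greedy heuristic standing in for the "existing approximation algorithm".

```python
def ldp_independent_set(adj):
    """Low-degree percolation for independent set, then a greedy fallback.

    adj: dict vertex -> set of neighbours (undirected, no self-loops).
    Returns a set of pairwise non-adjacent vertices.
    """
    graph = {v: set(nbrs) for v, nbrs in adj.items()}    # working copy
    chosen = set()

    def take(v):
        """Select v, delete v and its neighbours, return the affected vertices."""
        chosen.add(v)
        doomed = graph[v] | {v}
        affected = set()
        for x in doomed:
            for w in graph.pop(x):
                if w not in doomed:
                    graph[w].discard(x)
                    affected.add(w)
        return affected

    # Percolation: a degree-1 node is always safe to select.
    leaves = [v for v in graph if len(graph[v]) == 1]
    while leaves:
        v = leaves.pop()
        if v in graph and len(graph[v]) == 1:
            leaves.extend(w for w in take(v) if len(graph[w]) == 1)

    # Leftover graph: repeatedly select a vertex of minimum degree.
    while graph:
        take(min(graph, key=lambda x: len(graph[x])))
    return chosen

path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
path_set = ldp_independent_set(path)
```

On the five-node path this returns `{0, 2, 4}`, which is a maximum independent set; isolated vertices are picked up by the fallback loop.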

We  have  proved  its  approximation  ratio  in  the  following  Theorems: 

Theorem 17 (Main Theorem (MDS&MVC)) In a power-law graph $G_{(\alpha,\beta)}$, by using Algorithm 10, MDS and MVC can be approximated within

$$1 + (\Gamma - 1)\lambda$$

with probability at least $1 - p_\lambda$, where $\Gamma$ is the approximation ratio of MDS (or MVC) in general graphs w.r.t. a graph of size at most $e^{\alpha} \ldots$

Algorithm 11: LDP Algorithm for MIS Problem

Input : Power-law graph G
Output: MIS S

1  while ∃ nodes of degree 1 do
2      foreach node v of degree 1 do
3          Add v into S;
4          Remove v and all its neighbors N(v) from G;
5      end
6  end
7  Determine the leftover MIS in G using an existing approximation algorithm and add it into S;
8  return S;

For the structured random power-law (SRPL) graph, we obtained the following result:

Theorem 18 In a SRPL graph $G$, by using Algorithm 10, MDS and MVC can be approximated within

$$1 + (\Gamma - 1)\lambda$$

with probability at least

$$\sum_{\tau = 0}^{\lfloor e^{\alpha}/2 \rfloor} \Pr[C_2 = \tau]\,(1 - p_\lambda^{\tau}),$$

where $p_\lambda^{\tau} = \ldots$, in which $\lambda_\tau = \lambda + \ldots$

3  Training  and  Professional  Development 

3.1  Personnel  Supported 

This  grant  has  partially  supported  the  following  personnel. 

•  PI:  My  T.  Thai 

• Ying Xuan (Ph.D., graduated in Fall 2011, currently a Research Staff at IBM)

• Thang N. Dinh (Ph.D., graduated in Spring 2013, currently an Assistant Professor at Virginia

• Yilin Shen (Ph.D., graduated in Spring 2013, currently a Research Scientist at Samsung America, Research Center)

• Nam Nguyen (Ph.D., graduated in Spring 2013, currently an Assistant Professor at Towson University)

• MS Students: Yu-Song Syu (graduated in Spring 2012)

• Subhankar Mishra (current Ph.D. student, working towards his degree)

3.2 Training

Several of my Ph.D. students have been working with me on this project, which has given them advanced professional skills in conducting research and has supported their professional development, as shown above. They have been trained on a new set of problems while developing several fundamental theories and algorithms. The results of this work have been included in these Ph.D. students' dissertations (please see Appendix). Many of these students have obtained either faculty positions or research staff positions in industry.

The students have attended several conferences to present their papers, such as IEEE INFOCOM 2010, IEEE MILCOM 2011, ACM Mobicom 2011, COCOON 2011, IEEE SocialCom 2011, IEEE INFOCOM 2011, ACM CIKM 2012, ACM HyperText 2012, ACM WebSci 2012, ACM ICDM 2013, ACM ASONAM 2013, IEEE ICDCS 2013, IEEE GLOBECOM 2013, ACM WI 2014, and COCOON 2014.

Since the work concerns complex networks, I have used several of its results to offer a new course, namely Optimization in Adaptive Complex Systems and Social Networks. This is a graduate-level course focusing on many problems arising in complex networks and systems, such as vulnerability, cascading failures, and complex network models.

3.3  Professional  Development 

During the period of this project, the PI has served as an associate editor for the Journal of
Combinatorial Optimization, IEEE Transactions on Parallel and Distributed Systems, and the
Journal of Discrete Mathematics, and as an optimization brief series editor for Springer. She
has also been a PC chair for several conferences, including IEEE ISSPIT 2012, IEEE IWCMC
2012, and SIMPLEX 2011.

DySON 14, CSoNets 13, IEEE IWCMC 12, IEEE ISSPIT 12, SIMPLEX 11, DIS 11, COCOON 10,
CCNet 10

The PI is a co-founder and EiC of a new Springer journal, namely Computational Social
Networks. She has created and chairs a workshop on the mathematics of social networks
(http://www.cise.ufl.edu/rhythai/CSoNet.html), and has been a PC member of many
conferences, such as INFOCOM, SOCIALCOM, and ICDCS. She is founding another workshop on
interdependent networks, co-located with the first-rank conference IEEE INFOCOM 2015, namely
WIDN (pronounced as Widen). Information can be found at http://optnetsci.cise.ufl.edu/widn2015/

The PI has given several invited talks and seminars, listed as follows:

•  “Interdependent Networks Analysis,” Learning and Intelligent Optimization Conference (LION
8), Feb 16-21, 2014, Florida. Tutorial Talk

•  “Cybersecurity  in  an  Era  of  Online  Social  Networks”  Conference  on  Selected  Problems  on  IT 
and  Telecommunication,  Nov  14-15,  2013,  Danang,  Vietnam.  Plenary  Keynote 

•  “Dynamic  Community  Structure  Analysis  in  Complex  Networks,”  the  Int  Conference  on  the 
Dynamics  of  Information  Systems,  February  25-27,  2013.  USA.  Plenary  Keynote 

•  “Community  Structure  Analysis  in  Dynamic  Complex  Networks”  the  9th  AIMS  Conference 
on  Dynamic  Systems,  Differential  Equations  and  Applications,  July  1-5,  2012,  USA.  Invited 
Speaker 

•  “How the Power-law Distribution Impacts on the Complex Network Vulnerability” Int
Workshop on Complex Networks, March 7-9, 2012, USA. Invited Speaker

•  “Optimally  Use  of  Social  Networks  to  Manipulate  Information”,  University  of  Texas  Dallas, 
Department  of  Computer  Science,  September  2011.  Department  Colloquium  talk 


4  Results  Dissemination 

We have disseminated the results through several avenues:

•  Published the results in IEEE/ACM proceedings and journals

•  Presented our findings at conference meetings and seminars

•  Provided the source code of our algorithms upon request to several research groups

•  Developed two interactive web-based tools that help researchers in the networking field
conduct research based on community structure concepts:

1.  Identify community structure. Complex networks exhibit a community structure property:
nodes within a community are more densely connected to each other than to nodes in other
communities. Identifying community structures is a central topic of network science. This
research is of significant importance, as it provides insights into the functionality of a
network and is extremely useful in deriving social-based solutions. My group has developed a
web-based tool which allows researchers to input any network of interest and returns a
community structure of this input. The tool also allows users to interactively rearrange these
communities for a better view.

2.  Break community structure with the minimum cost. Researchers have believed that
communities are very strong and hard to break, as they are densely connected. However, I have
shown that this is not the case. In addition to the published theoretical results, I have
developed an online interactive tool for researchers to see how and where to break the
communities. This tool helps researchers gain insight into the structure of the studied
networks in order to devise better solutions to protect such networks.


5  Honors/Awards 

•  The PI is a recipient of the NSF CAREER Award, 2010-2015.

•  The PI received the Provost’s Excellence Award for Assistant Professors at the University of
Florida, 2010.

•  The PI was promoted early to the rank of Associate Professor in 2010.

•  Ph.D. students Yilin Shen and Thang Dinh have received UFIC Outstanding International
Student Awards.

6  Dataset


We have evaluated our algorithms on various datasets, described in the following.

6.1  Critical Node Detection

The networks which we use to evaluate the performance of our algorithms related to the
Critical Node Detection problem and its variants are described as follows:

•  The real terrorist network compiled by Krebs, with 62 nodes and 153 links, which reflects the
relationships between the terrorists involved in the terrorist attacks of Sep. 11, 2001. This
experiment evaluates the performance of HILPR on a real-world social network. In order to
break down the terrorist network, we can capture the individuals corresponding to the
critical nodes identified by HILPR.

•  Waxman network topology, a widely accepted Internet AS topological model, generated by
the well-known BRITE topology generator.

•  Power-law network topology, generated by the Barabási graph generator.

•  Western States power network of the US, with 4941 nodes and 6594 edges.

•  Small-world network topology generated by the igraph library using the Watts and Strogatz
model, with k = 2, n = 0.2 and 70 nodes.

•  US Network Assets, compiled with 71 nodes and 98 edges, which provides the current customer
needs in XO Communications service.

•  Erdős-Rényi: A random graph of 100 vertices and 200 edges following the Erdős-Rényi model.

•  Forest fire: A random power-law graph following the Forest fire model by Leskovec et al., with
forward and backward burning probabilities of 0.3 and 0.9, respectively.

•  US Backbone network: The backbone cabling network of the XO company.

•  CAIDA AS: The CAIDA AS Relationships Dataset from Sep. 17, 2007.

•  Oregon AS: AS peering information inferred from Oregon route-views between March 31 and
May 26, 2001. Only the largest connected component, with 11,174 nodes and 23,410 links, is
considered.

•  Gnutella P2P: Gnutella peer-to-peer network from Aug. 25, 2002. Nodes represent hosts
in the network and edges are the connections between the Gnutella hosts. It consists of 22,663
nodes and 108,386 edges.

•  Coauthor network in the Physics sections of the e-print arXiv.

•  Partial Facebook, with 63K nodes and 817K edges.

•  Orkut, with 3M nodes and 223M edges.
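
As a rough illustration of how a candidate set of critical nodes could be scored on any of the
networks above, the sketch below computes pairwise connectivity, i.e., the number of node
pairs still joined by a path after a set of nodes is removed. This is one common disruption
measure and is assumed here for illustration; the function name and the toy graph are
hypothetical, and this is not the HILPR algorithm mentioned above.

```python
from collections import defaultdict


def pairwise_connectivity(edges, removed=frozenset()):
    """Count node pairs that remain connected after deleting `removed` nodes."""
    adj = defaultdict(set)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        if u in removed or v in removed:
            continue  # edges touching a removed node disappear with it
        adj[u].add(v)
        adj[v].add(u)
    nodes -= set(removed)

    seen = set()
    total = 0
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search to measure the size of this connected component.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # A component of `size` nodes contributes size-choose-2 connected pairs.
        total += size * (size - 1) // 2
    return total


# Hypothetical toy network: two triangles joined through node 3.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
baseline = pairwise_connectivity(toy_edges)
after_cut = pairwise_connectivity(toy_edges, removed={3})
```

Comparing `baseline` with `after_cut` for different candidate sets shows which removals
fragment the network most; a lower value after removal means a more disruptive set.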

6.2  Network Structural Interdependency and Vulnerability Assessment

To detect and verify the community structures, we evaluated our algorithms on the datasets
listed in Table 2.

Table 2: Order and size of network instances

ID   Name                        Vertices (n)   Edges (m)
1    Zachary’s karate club       34             78
2    Dolphin’s social network    62             159
3    Les Miserables              77             254
4    Books about US politics     105            441
5    American College Football   115            613
6    US Airport 97               332            2126
7    Electronic Circuit (s838)   512            819
8    Scientific Collaboration    1589           2742

We also evaluated them on the social network datasets mentioned in the above section,
including Facebook, Twitter, Orkut, ENRON Email, ArXiv Citation,

In this dissertation, we focus on analyzing and understanding the organizational
principles, assessing the structural vulnerability, and exploring practical applications
of dynamic complex networks. In particular, we propose two adaptive frameworks
for identifying the nonoverlapping and overlapping community structure in dynamic
networks. Our approaches can not only update the network communities quickly and
efficiently, but also trace the evolution of those communities over time. We also suggest
a detection method based on nonnegative matrix factorization which can work on weighted
and directed networks. We then study the discovery of stable communities in the networks,
i.e., communities which are tightly connected and remain strong even over a long period
of time. Furthermore, we investigate the structural vulnerability of the network community
structure by identifying key nodes that play an important role in maintaining the normal
function of the whole system. This is a new research direction on the cyber-infrastructure
that we have recently introduced. To certify the effectiveness of our suggested frameworks
and algorithms, we extensively test them not only on synthesized networks but also
on real-world dynamic traces. Finally, we demonstrate the wide applicability of our
algorithms via realistic applications, such as limiting misinformation spread in Online
Social Networks as well as social-based forwarding and routing strategies and worm
containment in Mobile networks.

CHAPTER 1
INTRODUCTION 

1.1  Community  Detection  in  Dynamic  Complex  Networks 

Many complex systems in reality exhibit community structure [37][85], i.e., they
naturally divide into groups of vertices with denser connections inside each group and
fewer connections crossing groups, where vertices and connections represent network
users and their social interactions, respectively. Members of each community in a social
network usually share things in common, such as interests in photography, movies, music
or discussion topics, and thus tend to interact more frequently with each other than with
members outside of their community. Community detection in a network is the gathering of
network vertices into groups in such a way that nodes in each group are densely connected
to each other and sparsely connected to the rest of the network.
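
To make the "dense inside, sparse outside" criterion concrete, the following minimal sketch
counts how many edges fall inside communities and how many cross between them for a given
grouping. The function name, the toy graph and the community labels are assumptions made
purely for illustration.

```python
def edge_split(edges, community_of):
    """Return (internal, external) edge counts for a node -> community mapping."""
    internal = external = 0
    for u, v in edges:
        if community_of[u] == community_of[v]:
            internal += 1  # both endpoints lie in the same community
        else:
            external += 1  # the edge crosses between two communities
    return internal, external


# Hypothetical toy graph: two triangles connected by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_communities = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
internal, external = edge_split(toy_edges, toy_communities)
```

A grouping that matches the intuitive definition should give many internal edges and few
external ones, as in this toy case.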

It is worth differentiating between community detection and graph clustering.
These two problems share the same objective of partitioning network nodes into groups;
however, the number of clusters is predefined or given as part of the input in graph
clustering, whereas the number of communities is typically unknown in community
detection. Detecting communities in a network provides meaningful insights into
its internal structure as well as its organizational principles. Furthermore, knowing the
structure of network communities can shed light on still-uncovered parts of the network,
and thus helps in preventing potential networking diseases such as virus or worm
propagation. Studies on community detection in static networks can be found in an
excellent survey [58] as well as in [76][6][78][8] and references therein.

Real-world complex networks, however, are not always static. In fact, most
complex systems in reality (such as Facebook, Bebo and Twitter among online social
networks, or OSNs) evolve and expand in size as their users increase, and thus lend
themselves to the field of dynamic networks. A dynamic network is a special type of
evolving complex network in which changes are frequently introduced over time. In an
online social network, such as Facebook, Twitter or Flickr, changes are usually introduced
by users joining or withdrawing from one or more groups or communities, by friends
connecting with each other, or by new people making friends with each other. Any one of
these events seems to have little effect on the local structure of the network; over a long
period of time, however, the dynamics of the network may lead to a significant
transformation of the network community structure, which raises a natural need for
reidentification. However, the rapidly and unpredictably changing topology of a
dynamic social network makes this an extremely complicated and challenging problem.

Although one could run any of the widely available static community detection methods
[76][6][78][17] to find the new community structure whenever the network is updated,
doing so has disadvantages that cannot be neglected: (1) the long running time of a static
method on large networks, (2) the trap of local optima, and (3) almost the same reaction,
i.e., a full recomputation, even to a small change in some local part of the network. A
better, much more efficient and less time-consuming way to accomplish this expensive task
is to adaptively update the network communities from the previously known structures,
which avoids the hassle of recomputing from scratch. This adaptive approach is the main
focus of our study in this paper. In Figure 1-1, we briefly illustrate the idea of dynamic
network community structure adaptation. Here, the network evolves from time $t$ to $t+1$
under the change $\Delta G_t$. The adaptive algorithm $\mathcal{A}$ quickly finds the new
community structure $C(G_{t+1})$ based on the previous structure $C(G_t)$ together with
the changes $\Delta G_t$.
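
The interface suggested by Figure 1-1 can be sketched as a function that takes the previous
community assignment and a change, and touches only the affected nodes. The sketch below
handles just one hypothetical event type (a new node arriving with its edges) using a simple
majority-of-neighbors rule. The rule and all names are assumptions for illustration only;
they are not the adaptive algorithms developed in this dissertation.

```python
from collections import Counter


def adapt_new_node(community_of, new_node, neighbors):
    """Return an updated copy of `community_of` after `new_node` joins.

    The new node adopts the community most common among its known neighbors;
    with no known neighbors it starts a singleton community of its own.
    """
    updated = dict(community_of)  # leave the previous structure untouched
    votes = Counter(community_of[n] for n in neighbors if n in community_of)
    if votes:
        updated[new_node] = votes.most_common(1)[0][0]
    else:
        updated[new_node] = ("singleton", new_node)
    return updated


# Previous structure C(G_t) and one change: node 5 arrives linked to 0, 1 and 3.
previous = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
current = adapt_new_node(previous, new_node=5, neighbors=[0, 1, 3])
```

The point of the sketch is only the shape of the computation: the cost depends on the size of
the change (here, the new node's neighborhood) rather than on the size of the whole network.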

Figure 1-1. The general framework for our adaptive community detection algorithm
$\mathcal{A}$: the network evolves as $G_t \to G_{t+1}$ and the community structure is
updated as $C(G_t) \to C(G_{t+1})$.

1.2  Nonnegative Matrix Factorization for Community Detection

Community identification on complex networks is a well-established field, and many
efficient graph-based methods have been introduced in the literature (see [32] for an
excellent survey). Unfortunately, these methods exhibit a strong dependence on some
local parts of the network topology, and the meaning and interpretation of the detected
overlapping communities remain implicit. Recently, NMF-based algorithms for
detecting network communities have gained great attention due to their meaningful
interpretation [102]. In general, given a nonnegative matrix $X \in \mathbb{R}^{m \times n}$
and a number $k \ll \min\{m, n\}$, an NMF problem asks for nonnegative matrices
$W \in \mathbb{R}^{m \times k}$ and $H \in \mathbb{R}^{k \times n}$ such that $\|X - WH\|$
is minimized, where $\|\cdot\|$ is a cost function (usually the Frobenius distance or the
I-divergence). One notable property of NMF is its close relationship to K-means clustering
and graph partitioning [67][24], which are also closely related to community identification.
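
For readers unfamiliar with NMF, the problem stated above can be tried on a small matrix with
a few lines of code. The sketch below minimizes the Frobenius distance using the classical
multiplicative update rules of Lee and Seung, chosen here only because they are short. It is
not the iSNMF or iANMF method of this dissertation, and the helper names, the iteration
count and the toy adjacency matrix are assumptions for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(x, w, h):
    """Frobenius distance ||X - WH||."""
    wh = matmul(w, h)
    return sum((xi - yi) ** 2 for xr, yr in zip(x, wh) for xi, yi in zip(xr, yr)) ** 0.5


def nmf(x, k, iterations=200, seed=0, eps=1e-9):
    """Approximate X (m x n) by W (m x k) times H (k x n), all nonnegative."""
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)] for i in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(m)]
    return w, h


# Adjacency matrix of a hypothetical toy graph: two triangles joined by one edge.
x = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
w0, h0 = nmf(x, k=2, iterations=0)   # random starting point
w, h = nmf(x, k=2, iterations=200)   # factors after the updates
```

Because every update multiplies by a nonnegative ratio, the factors stay nonnegative
throughout, which is what allows the entries of the factors to be read as membership
strengths.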

A few attempts have been made along this line. Lin et al. proposed MetaFac [72], an
NMF-based method for extracting community structure through relational hypergraphs.
This method, however, is not capable of identifying overlapping structures. In [90],
Psorakis et al. proposed an approach for finding overlapping communities using a
Bayesian NMF based on hyperparameters. This method has the advantages of automatically
determining the number of communities and not suffering from the resolution limit.
Unfortunately, its built-in estimate of the number of communities could mislead the
factorization into returning a poor solution. In [103], Wang et al. proposed NMF methods
based on the Frobenius norm with the capability of extracting overlapping structures.
However, we find that these approaches do not perform well on weighted directed networks,
as shown in the experiments.

To overcome the above limitations, we introduce two NMF approaches, namely
iSNMF and iANMF, for effectively identifying social network communities with meaningful
interpretations. In particular, we are interested in the approximation $X \approx HSH^T$,
since this factorization provides $H$ as the community indicator matrix and $S$ as the
community-interaction strength matrix. This factorization, as a result, nicely
reflects the overlap of network communities and promises a meaningful community
interpretation that is independent of the network topology.
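
Written out as an optimization problem, and assuming the Frobenius distance as the cost
function (one of the two options mentioned earlier), the factorization reads as follows. The
dimensions $n$ (number of nodes) and $k$ (number of communities) are notation introduced
here for illustration; the exact objectives used by iSNMF and iANMF may differ.

```latex
\min_{H \ge 0,\; S \ge 0} \; \left\| X - H S H^{T} \right\|_F^2,
\qquad X \in \mathbb{R}^{n \times n},\quad
H \in \mathbb{R}^{n \times k},\quad
S \in \mathbb{R}^{k \times k}
```

Under this reading, the entry $H_{ic}$ indicates how strongly node $i$ belongs to community
$c$, so a node with several large entries in its row of $H$ is an overlapping node, and
$S_{cd}$ indicates how strongly communities $c$ and $d$ interact.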

1.3  Applications of the Network Community Structure

Detecting the community structure of a dynamic social network has considerable uses.
To give a sense of this, consider the routing problem in a communication network where
nodes and links represent people and mobile communications, respectively, such as a
mobile ad hoc network (MANET). Due to node mobility and unstable links, designing an
efficient routing scheme is extremely challenging. However, since people have a natural
tendency to form groups of communication, there exist groups of nodes in the underlying
MANET which are more densely connected inside than outside, and these groups therefore
form community structure in that MANET. An effective routing algorithm, as soon as
it discovers the network community structure, can directly route or forward messages
to nodes in the same community as the destination (or in a related one). In this way,
we can avoid unnecessarily forwarding messages through nodes in different
communities, and thus lower the number of duplicate messages and reduce
the overhead information, both of which are essential in MANETs.

Another good example is worm containment in cellular networks [110] or in OSNs
[81][82]. Nowadays, many social applications, such as Facebook, Twitter
and Foursquare, are able to run on open-API enabled mobile devices like PDAs and
iPhones. However, if such an application is infected with malicious software, such as
worms or viruses, this openness also makes it easier for the malware to propagate. A
possible solution to prevent worms from spreading further is to send patches to critical
users and let them redistribute the patches to the others. Intuitively, the smaller the set
of important users for sending patches, the better. But how can we effectively choose that
set of minimal size? This is where community structure comes into the picture. In
particular, we show that selecting users in the boundaries of the overlapped nodes gives a
tighter and more efficient set of influential users, and thus significantly lowers the number
of sent patches as well as the overhead information, both of which are essential in
cellular networks and OSNs.

1.4  The  Identification  of  Stable  Communities 

OSNs in reality are highly dynamic, as social interactions on them tend to come and
go quickly. Consequently, their communities are also dynamic and evolve heavily as
the networks change over time. However, Palla et al. observe in their seminal work that
some communities in social networks are tightly connected and remain strong even
over a long period of time [86]. The authors also point out that large communities
with high internal densities and fewer external distractions tend to remain stable during
the network evolution, which intuitively agrees with the findings reported in [49]. These
observations resemble the concept of stable communities in OSNs. For example,
stable communities on Facebook can be visualized as groups of users who have devoted
themselves to one particular interest such as movies, music or photography. Likewise,
a stable community on Twitter can be illustrated by a group of users who may follow
many accounts but are loyal to only one specific celebrity. From a different perspective,
stable communities in a citation network may refer to well-established research topics in
the field, whereas unstable communities may represent topical or recently arising research
directions.

The discovery of these stable communities, as a consequence, will provide us with
valuable insights into the core properties and characteristics not only of each community
but also of the network as a whole. This knowledge can further benefit information
retrieval in OSNs, as searches can be redirected to the stable communities sharing the
most similar characteristics with the queries for more meaningful answers. For instance,
well-established research topics in a citation network can be mined more effectively
when one looks at its stable rather than unstable communities, as discussed above.
However, the large-scale and nonreciprocal topologies of OSNs in reality make the
detection of stable community structure an extremely challenging yet topical problem.

1.5  The  Assessment  of  Network  Community  Structure  Vulnerability 

Complex systems, despite their diversity in physical infrastructures and underlying
interactions, are extremely vulnerable to node attacks. In some scenarios,
the failure of only a few key nodes is enough to bring the whole network operation
to its knees [25]. More importantly, this vulnerability can further propagate
to a wider population, leading to much more devastating consequences. In order to
develop a comprehensive understanding of this type of attack, it is therefore important
to understand not only the impact of node failures on the network components but also
the intra- and interdependencies among those components [88]. In particular, it is crucial
to explore how the failure of a single node, or of a set of nodes in general, can
significantly change the structure of the network components, as well as how these
components would affect each other in cases of attacks. However, the large scale and
dynamic properties of complex systems in practice make this a complicated problem.

To tackle this problem, we introduce the use of network modules to study both
the impact of node failures and the interdependency of network components. There
are several reasons for, and benefits of, this approach. First, investigating the
interdependencies based on the topology of the underlying network structures is a major
aspect that must be considered to understand the behavior of structural vulnerability
[88]. Second, most complex networks exhibit a modular property; in other words, they
contain community structure in their underlying organization. That is, the network nodes
can be gathered into groups in such a way that each group is densely connected internally
and sparsely connected externally [38][75]. Nodes in each community usually share similar
functions and characteristics that distinguish them from the others. In a broader view,
communities display the whole network structure at a compact and more understandable
level, where a community may represent an entity or a functional group in the system. At
this level, element failures in one community can have a profound impact which can
consequently lead to changes in other communities. Therefore, identifying network elements
that are essential to its community structure is a fundamental and extremely important
issue. To the best of our knowledge, this research direction has not been addressed so far
in the literature.

Figure 1-2. The classification of community detection algorithms in complex networks.

1.6  Literature  Review 
Community  detection  in  dynamic  networks 

Community detection in complex networks is a well-established field, and a
tremendous number of identification methods have been proposed in the literature.
Some notable directions include classical graph clustering algorithms [4][73],
dynamic approaches [92], modularity optimization methods [75], statistical inference
[79], and random walks for community detection [28] (see [32] and references therein for an
excellent survey).

In general, community structure detection algorithms for complex networks
can be classified in different ways: as nonoverlapping or overlapping detection
algorithms, as static or dynamic algorithms, as algorithms for directed or undirected
networks, etc. Figure 1-2 gives a detailed classification of 16 different types of
identification algorithms.

Community detection on static networks has attracted a lot of attention, and many
efficient methods have been proposed for this type of network. Detecting community
structure on dynamic networks, however, has so far been a largely untrodden area. A recent
work of Palla et al. [85] proposed an innovative method for detecting communities on
dynamic networks based on the $k$-clique percolation technique. This approach can
detect overlapping communities; however, it is time consuming, especially on
large-scale networks. Another recent work of Zhang et al. [109] proposed a detection method
based on contrasting the network topology with the topology-based propinquity, where
propinquity is the probability that a pair of nodes is involved in a community. The work in
[98] presented a parameter-free methodology for detecting clusters on time-evolving graphs
based on the mutual information and entropy functions of information theory. Hui et al. [48]
proposed a distributed method for community detection in which modularity was used
as a measure instead of as an objective function. Apart from that, [44] attempted to track
the evolution of communities over time using a few static network snapshots.

In [99], the authors present a framework for identifying dynamic communities with
a constant-factor approximation. However, this method does not appear to be practical
for real-world social networks, since it requires some predefined penalty costs which are
generally unknown on dynamic networks. In a recent work [26], Thang et al. proposed a
social-aware routing strategy in MANETs which also makes use of a modularity-based
procedure named MIEN for quickly updating the network structure. In particular, MIEN
tries to compose and decompose network modules in order to keep up with the changes,
and uses the fast modularity algorithm [76] to update the network modules. However, this
method performs slowly on large-scale dynamic networks due to the high complexity of
[76].
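
Several of the methods above are modularity-based. For reference, a minimal sketch of the
standard Newman-Girvan modularity of a partition is given below, assuming an undirected,
unweighted graph supplied as an edge list. The function name and the toy graph are
illustrative and are not taken from the cited works, which may use extended or weighted
variants.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (fraction of internal edges) - (expected fraction)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Hypothetical toy graph: two triangles connected by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "all" for n in range(6)}
q_split = modularity(toy_edges, two_groups)
q_single = modularity(toy_edges, one_group)
```

Splitting the toy graph into its two triangles scores higher than putting every node in one
community, which is the behavior a modularity-maximizing method exploits.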

In [70], Lin et al. proposed FacetNet, a framework for analyzing communities in
dynamic networks based on the optimization of snapshot costs. FacetNet is guaranteed
to converge to a locally optimal solution; however, its convergence speed is slow and
its input asks for the number of network communities, which is usually unknown
in practice. In [27], Duan et al. proposed Stream-Group, an incremental method
to solve the community mining problem and detect the change points in weighted dynamic
graphs. This method is modularity-based and thus may inherit the resolution limit while
discovering network communities. In another attempt, Kim et al. [52] suggested a
particle-and-density based clustering method for dynamic networks, based on the
extended modularity and the concepts of nano-community and $l$-quasi-clique-by-clique.
Apart from that, the work of Cazabet et al. [9] proposed the iLCD method to find
overlapping network communities by adding edges and then merging similar communities.
However, this model might not be sufficient to capture the dynamic behaviors
of the network when nodes are introduced or removed, or when existing edges
are removed from the network. In [60], the authors presented OSLOM, a framework for
testing the statistical significance of a cluster with respect to a global null model (e.g., a
random graph). To expand a community, OSLOM locally computes the value $r$ for each
neighbor node and tries to include that node in the current community.

Nonnegative  matrix  factorization  for  community  detection 

Community detection on complex networks is a mature research area and, besides
NMF-based algorithms, many effective graph-based or topology-based algorithms
have been proposed for this purpose. In general, detection methods can be classified
into non-overlapping (disjoint) and overlapping algorithms. Traditional non-overlapping
algorithms [75][17][77] may return good community identification results, but they are
not able to reveal overlapping network structures, particularly on social networks.

In the other category, algorithms for graph-based and topology-based detection
of overlapping communities have also been proposed in the literature. Most of them
are based on the clique-percolation [84] or clique-extension [63] techniques, on the
extended modularity [62][83], on a specific fitness function [56], on label propagation
[40], or on link-based techniques [1]. See [32] and references therein for an excellent
survey of those detection methods.

Although the success of the aforementioned algorithms has been theoretically
and empirically verified, they still exhibit the following limitations: (1) a strong
dependence on some local parts of the network topology, e.g., the clique-percolation
method depends on some dense subnetworks in order to percolate, a link-based
technique relies on potential links with the highest degrees, a modularity-based technique
depends on the network hierarchy in order to maximize modularity, etc.; and (2) the
implicit meaning and interpretation of the detected overlapping communities, e.g.,
what is the contribution of an overlapped node to these percolated cliques, or why would
it even be there? These shortcomings drive the need for a better
approach with a more meaningful interpretation.

Stable  community  detection 

The discovery of stable communities, on the contrary, is still a largely unexplored area
in which only a few attempts have been made [23][59][68]. This special property of
network communities was perhaps first observed by Palla et al. in their seminal work [86],
where they point out that tight-knit communities with high internal densities and fewer
external distractions tend to remain strong over time, which resembles the concept
of community stability. Delvenne et al. [23] extend this general concept to propose
a measure, called “stability of the clustering r(t, H)”, to quantify how stable a given
cluster (or community structure) H is at a specific time step t based on the Markov
autocovariance model. Under this notation, a cluster H is stable at time t if a high value
of r(t, H) is observed. This quantity, however, is more appropriate for the verification rather
than the identification of stable network communities since it requires the time step t
to be specified a priori.

In a different approach, Lancichinetti et al. [59] investigate the consensus of
community detection methods. The authors report that, given a particular algorithm
A, the consensus on communities found by A after multiple runs dramatically improves
the quality of the detection, and hence suggest that those communities are candidates
for stable structures. This is a very interesting approach; however, it has the
disadvantages that (1) it is computationally expensive and time consuming,
and (2) the convergence of the whole iterative process is not guaranteed. In a
recent attempt, Yanhua et al. [68] utilize the concept of mutual links and suggest a
spectral-clustering-based identification method that tries to maximize the total mutual
connections in order to find stable communities. However, some mutual links may be
of low magnitude and thus may not contribute significantly to the
overall stability at the community level.

Structural  vulnerability  assessment  of  community  structure 

Community structure and complex network vulnerability are two major and
well-developed areas of networking research. Surveys on community structure
detection algorithms and on methods for assessing network vulnerabilities can
be found in the work of Fortunato et al. [32] and Grubesic et al. [41], respectively.
However, assessing the vulnerability of network community structure has so far been an
unexplored area. A large body of work has been devoted to finding node roles within a
community by a link-based technique together with a modification of node degree [95],
by using the spectrum of the graph [105], by using the within-module degree and
participation coefficient [42], or by the detection of key nodes, overlapping communities
and “date” and “party” hubs [54]. However, none of these approaches discusses how the
community structure would change upon the failure of those important nodes, especially in
terms of the NMI measure.

The vulnerability of network function and structure has been examined under
node centrality metrics, such as high degree and betweenness centrality, as well as
under the average shortest path, which measures the lengths of the shortest distances
between node pairs [41], under the pairwise connectivity metric, whose goal is to
break the network’s pairwise connectivity down to a certain level [25], or under the
available number of compromised s-t flows [74], etc. However, there is an even more
crucial risk that could dramatically affect normal network functionality and that has not
been addressed so far: the transformation or restructuring of the network community
structure. Due to its vital role in the network, any significant restructuring or transformation
of the community structure resulting from the removal of important nodes can potentially
change the entire network organization and consequently lead to a malfunction or
unpredictable corruption of the whole network.

1.7  Dissertation  outline 

In chapter 2, we propose QCA, a fast and adaptive method for efficiently identifying
the nonoverlapping community structure of a dynamic social network. Our approach
takes into account the previously discovered structure and processes network changes
only, thus significantly reducing computational cost and processing time. We study the
dynamics of a social network and prove theoretical results regarding the behavior of its
communities over time, which form the basis of our method. We extensively evaluate our
algorithms on both synthesized and real dynamic social traces. Experimental results
show that QCA achieves not only competitive modularity scores but also high-quality
community structures in a timely manner. We apply QCA to the worm containment
problem in OSNs. Simulation results show that QCA outperforms currently available
methods and confirm its applicability to social network problems.

In chapter 3, we suggest AFOCS, a two-phase adaptive framework for not only
detecting and updating overlapping network communities but also tracing their
evolution over time. Theoretical analyses show that AFOCS partially achieves more than
74% of the internal density of the optimal solution. We evaluate AFOCS on
both synthesized and real traces in comparison to the state-of-the-art and
most popular static detection methods COPRA and CFinder, as well as to the recent
adaptive methods FacetNet, iLCD and OSLOM. Empirical results show that AFOCS
achieves both competitive results and high-quality community structures in a timely
manner. Finally, with AFOCS, we suggest a community-based forwarding strategy for
communication networks that reduces overhead information by up to 11x while maintaining
competitive delivery time and ratio. We also propose a new social-aware patching
scheme for containing worms in OSNs, which helps reduce the infection rate by up to 7x
on the Facebook dataset.

We analyze two NMF approaches in chapter 4, namely iSNMF and iANMF, for
effectively identifying social network communities with meaningful interpretations.
In particular, we are interested in the approximation $X \approx HSH^T$, since this factorization
provides H as the foundation feature matrix and S as the feature interaction
matrix. Alternatively, H and S can also be thought of as the community indicator and
inter-community strength matrices, whose row elements can further be interpreted
as probabilities of nodes belonging to different communities. As a result, this
factorization nicely reflects the overlap of network communities and promises a meaningful
community interpretation that is independent of the network topology.

From an application perspective, we illustrate the practical applications of the network
community structure via two emerging problems in social and mobile computing,
namely the worm spread containment problem on online social networks (chapter
5) and the forwarding and routing strategy on mobile networks (chapter 6). We
demonstrate that methods and strategies employing QCA and FOCS as community
detection cores obtain a significant improvement in terms of performance and solution
quality. These realistic applications highlight the wide applicability of the network
community structure to many problems enabled by complex networks.

In chapter 7, we suggest an estimation which provides helpful insights into the
stability of links in the input network. Based on that, we propose SCD, a framework
to identify community structure in directional OSNs with the advantage of community
stability. We next explore an essential connection between the persistence probability
of a community at the stationary distribution and its local topology, which is the
fundamental mathematical theory supporting the SCD framework. To certify the
efficiency of our approach, we extensively test SCD on both synthesized datasets
with embedded communities and real-world social traces, including the NetHEPT and
NetHEPT WC collaboration networks as well as Facebook social networks, in reference
to the consensus of other state-of-the-art detection methods. Highly competitive
empirical results confirm the quality and efficiency of SCD in identifying stable
communities in OSNs.

In chapter 8, we introduce the CSV problem to assess the impact of node failures
on the network community structure. To the best of our knowledge, this is the first
attempt in this line of research. We analyze possible conditions that can lead to the
minimization of NMI on network community structures. We suggest the concept
of generating edges of a community and provide an optimal solution for finding an
MGES. We propose genEdge, a node selection strategy for CSV based on the MGES
solution. We conduct experiments on both synthesized data with known community
structures and real-world traces. Empirical results reveal that genEdge outperforms
other node selection strategies in terms of solution quality as well as in reference to
different underlying community detection algorithms. From an application perspective, we
demonstrate the critical importance of CSV via forwarding and routing strategies
in delay tolerant networks (DTNs), where the failures of some important devices
significantly degrade the entire system’s performance.

Finally, we summarize our contributions and conclude the dissertation in chapter 9.

CHAPTER 2

NONOVERLAPPING  COMMUNITY  STRUCTURE  DETECTION 

In  this  chapter,  we  present  QCA,  our  proposed  algorithms  for  detecting  nonoverlapping 
community  structure  in  a  dynamic  complex  network.  In  the  following  sections,  we  first 
introduce  the  preliminaries  in  section  2.1  and  then  describe  our  QCA  method  in  detail 
in  section  2.2.  Finally,  the  empirical  evaluations  of  QCA  on  both  synthesized  and  real 
datasets  are  presented  in  section  2.3. 

2.1  Problem  Definition 

We  first  present  the  notations,  objective  function  as  well  as  the  dynamic  graph 
model  representing  a  social  network  that  we  will  use  throughout  this  section. 

(Notation) Let $G = (V, E)$ be an undirected unweighted graph with N nodes and
M links representing a social network. Let $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ denote a collection of
disjoint communities, where $C_i \in \mathcal{C}$ is a community of G. For each vertex u, denote by
$d_u$, $C(u)$ and $NC(u)$ its degree, the community containing u and the set of its adjacent
communities, respectively. Furthermore, for any $S \subseteq V$, let $m_S$, $d_S$ and $e_S^u$ be the number of links
inside S, the total degree of vertices in S and the number of connections from u to S,
respectively. The pairs of terms community and module, node and vertex, as well as
edge and link are used interchangeably.

(Dynamic social network) Let $G^s = (V^s, E^s)$ be a time-dependent network
snapshot recorded at time s. Denote by $\Delta V^s$ and $\Delta E^s$ the sets of vertices and links
to be introduced (or removed) at time s, and let $\Delta G^s = (\Delta V^s, \Delta E^s)$ denote the change in
terms of the whole network. The next network snapshot $G^{s+1}$ is the current one together
with the changes, i.e., $G^{s+1} = G^s \cup \Delta G^s$. A dynamic network $\mathcal{G}$ is a sequence of network
snapshots evolving over time: $\mathcal{G} = (G^0, G^1, \ldots, G^s)$.

(Objective function) In order to quantify the goodness of a network community
structure, we take into account the most widely accepted measure called modularity Q
[78], which is defined as:

$$Q = \sum_{C \in \mathcal{C}} \left( \frac{m_C}{M} - \frac{d_C^2}{4M^2} \right).$$


Basically, Q is the fraction of all links that fall within communities minus the expected value
of the same quantity in a graph whose nodes have the same degrees but whose links are
distributed randomly; the higher the modularity Q, the better the network community
structure. Therefore, our objective is to find a community assignment for each
vertex in the network such that Q is maximized. Modularity, just like other quality
measures for community identification, has certain disadvantages such as
its non-locality and scaling behavior [8], or its resolution limit [35]. However, it is still
highly regarded due to its robustness and usefulness, and it closely agrees with intuition on
a wide range of real-world networks.
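As an illustration only, the modularity defined above can be evaluated with a few lines of Python. This is a minimal sketch that assumes the graph is given as a list of undirected edges and the community assignment as a dictionary; the function name `modularity` and the toy graph are hypothetical and are not part of QCA.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Return Q = sum over communities C of m_C / M - d_C^2 / (4 M^2)."""
    M = len(edges)
    if M == 0:
        return 0.0
    m = defaultdict(int)  # m_C: links with both endpoints inside C
    d = defaultdict(int)  # d_C: total degree of the vertices in C
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        d[cu] += 1
        d[cv] += 1
        if cu == cv:
            m[cu] += 1
    return sum(m[c] / M - d[c] ** 2 / (4 * M * M) for c in d)


# Toy graph (hypothetical): two triangles joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
merged = {v: "A" for v in range(6)}

q_split = modularity(edges, split)    # one community per triangle
q_merged = modularity(edges, merged)  # everything in a single community
```

On this toy graph the two-triangle assignment scores higher than the single-community assignment, which matches the intuition that Q rewards densely connected groups with few links between them.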

Problem Definition: Given a dynamic social network $\mathcal{G} = (G^0, G^1, \ldots, G^s)$, where $G^0$ is
the original network and $G^1, G^2, \ldots, G^s$ are the network snapshots obtained through $\Delta G^1$,
$\Delta G^2, \ldots, \Delta G^s$, we need to devise an adaptive algorithm to efficiently detect and identify
the network community structure at any time point, utilizing the information from the
previous snapshots, as well as to trace the evolution of the network community structure.

2.2  Algorithm  Description 

Let us first discuss how changes to the evolving network topology affect the
structure of its communities. We use the term intra-community links to denote edges
whose two endpoints belong to the same community, and the term inter-community links
to denote those whose endpoints belong to different communities. For each community
C, the connections linking C with other communities are much fewer than those within C
itself, i.e., nodes in C are densely connected inside and less densely connected outside.
Intuitively, adding intra-community links inside or removing inter-community links
between communities of G will strengthen those communities and make the structure of
G clearer. Conversely, removing intra-community links and inserting inter-community
links will loosen the structure of G. The community updating process, however, is
challenging since an insignificant change in the network topology can possibly lead to an
unexpected transformation of its community structure.

We will discuss in detail the possible behaviors of dynamic network communities depicted in
Figure 2-1. Figure 2-1A: a new edge (u, v) is introduced; u and v are first checked and
memberships are then tested on X and Y. Figure 2-1B: (a) the original community; (b) after
the dotted edge is removed, two smaller communities arise. Figure 2-1C: (a) the original four
communities; (b) after the central node is removed, the leftover nodes join different modules,
forming three new communities. Figure 2-1D: (a) the original community; (b) when g is
removed, a 3-clique is placed at a to discover b, c, d and e; f is assigned as a singleton afterwards.

In order to reflect changes introduced to the social network, its underlying graph is
constantly updated by either inserting or removing a node or a set of nodes, or by either
introducing or deleting an edge or a set of edges. In fact, the introduction or removal of a
set of nodes (or edges) can be decomposed into a sequence of node (or edge) insertions
(or removals), in which a single node (or a single edge) is introduced (or removed) at a
time. This observation allows us to treat network changes as a collection of simple events,
where a simple event is one of newNode, removeNode, newEdge and removeEdge,
detailed as follows:

•  newNode ($V \cup \{u\}$): A new node u with its associated edges is introduced. u
may come with no edge or with one or more new edges.

•  removeNode ($V \setminus \{u\}$): A node u and its adjacent edges are removed from the
network.

•  newEdge ($E \cup \{e\}$): A new edge e connecting two existing nodes is introduced.

•  removeEdge ($E \setminus \{e\}$): An existing edge e in the network is removed.
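The decomposition of a batch change into simple events can be sketched as follows. This is an illustrative sketch, not the procedure used in QCA: the function `simple_events`, its argument names, and the chosen processing order (node insertions, then edge insertions, then edge removals, then node removals) are assumptions made for the example.

```python
def simple_events(nodes, added_nodes=(), added_edges=(),
                  removed_edges=(), removed_nodes=()):
    """Yield one simple event at a time for a batch change of the graph.

    The processing order used here is an assumption of this sketch.
    """
    known = set(nodes)
    pending = list(added_edges)
    for u in added_nodes:
        # Edges joining u to vertices that already exist travel with newNode.
        attached = [(a, b) for (a, b) in pending
                    if (a == u and b in known) or (b == u and a in known)]
        pending = [e for e in pending if e not in attached]
        known.add(u)
        yield ("newNode", (u, attached))
    for e in pending:
        # What is left connects two vertices that are both present.
        yield ("newEdge", e)
    for e in removed_edges:
        yield ("removeEdge", e)
    for u in removed_nodes:
        yield ("removeNode", u)


# Hypothetical batch: nodes 4 and 5 arrive, edge (1, 2) appears,
# edge (2, 3) disappears and node 3 leaves.
events = list(simple_events(
    nodes={1, 2, 3},
    added_nodes=[4, 5],
    added_edges=[(4, 1), (4, 5), (1, 2)],
    removed_edges=[(2, 3)],
    removed_nodes=[3],
))
```

Each yielded event can then be handed to the corresponding handler (new node, new edge, node removal, edge removal) described in the following subsections.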

Our approach first requires an initial community structure $\mathcal{C}_0$, which we call the
basic structure, in order to proceed. Since the input model is restricted to an
undirected unweighted network, this initial community structure can be obtained by
performing any of the available static community detection methods [76][6][17]. To
obtain a good basic structure, we choose the method proposed by Blondel et al. in [6],
which produces a good network community structure in a timely manner [58].

Figure 2-1. Possible behaviors of the network community structure during evolution.

2.2.1  New  node 

Let us consider the first case, when a new node u and its associated connections
are introduced. Note that u may come with no adjacent edges or with many of them
connecting to one or more communities. If u has no adjacent edge, we create a new
community for it and leave the current structure intact. The interesting case happens,
and it usually does, when u comes with edges connecting to one or more existing
communities. In this latter situation, we need to determine which community u should
join in order to maximize the gained modularity. Several local methods have been
introduced for this task, for instance the algorithms of [76][17]. Our method is inspired by
a physical approach proposed in [107], in which each node is influenced by two forces:
$F_{in}^{C}(u)$ (the force keeping u inside its community C) and $F_{out}^{C}(u)$ (the force a community C
exerts in order to bring u to C), defined as follows:


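$$F_{in}^{C}(u) = e_C^u - \frac{d_u (d_C - d_u)}{2M}$$

and

$$F_{out}^{C}(u) = \max_{S \in NC(u)} \left\{ e_S^u - \frac{d_u d_{outS}}{2M} \right\},$$

where $d_{outS}$ has the opposite meaning of $d_S$.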

Taking into account the above two forces, a node v can actively determine its
best community membership by computing those forces and either join the
community S having the highest outer force $F_{out}^{S}(v)$ (if $F_{out}^{S}(v) > F_{in}^{C(v)}(v)$) or stay in its current
community C(v) otherwise. Theorem 2.1 bridges the connection between those
forces and the objective function, i.e., joining the new node to the community with the
highest outer force maximizes the local gained modularity. The process is presented
in Alg. 1.
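The force-based membership test can be sketched in Python as below. This is only a sketch under stated assumptions: the graph is an adjacency dictionary of sets, communities are named sets of vertices, and $d_{outS}$ is read as the total degree of S with u outside it, which is how the quantity enters the proof of Theorem 2.1. All function names and the toy graph are hypothetical.

```python
def degree_sum(adj, members):
    """Total degree d_S of the vertices in `members`."""
    return sum(len(adj[v]) for v in members)


def links_to(adj, u, members):
    """Number of connections e_S^u from u to `members`."""
    return sum(1 for v in adj[u] if v in members)


def inner_force(adj, u, community, M):
    """F_in^C(u) = e_C^u - d_u (d_C - d_u) / (2M)."""
    d_u = len(adj[u])
    return links_to(adj, u, community) - d_u * (degree_sum(adj, community) - d_u) / (2 * M)


def outer_force(adj, u, community, M):
    """e_S^u - d_u d_outS / (2M), with d_outS taken as the degree of S without u
    (an assumption of this sketch)."""
    others = community - {u}
    return links_to(adj, u, others) - len(adj[u]) * degree_sum(adj, others) / (2 * M)


def best_community(adj, communities, u, current):
    """Name of the community u should belong to according to the two forces."""
    M = sum(len(nbrs) for nbrs in adj.values()) // 2
    candidates = [name for name, members in communities.items()
                  if name != current and links_to(adj, u, members) > 0]
    if not candidates:
        return current
    best = max(candidates, key=lambda name: outer_force(adj, u, communities[name], M))
    stay = inner_force(adj, u, communities[current], M)
    return best if outer_force(adj, u, communities[best], M) > stay else current


# Hypothetical example: two triangles A and B joined by (2, 3); the new node 6
# arrives with one link into A and two links into B.
adj = {0: {1, 2, 6}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5, 6},
       4: {3, 5, 6}, 5: {3, 4}, 6: {0, 3, 4}}
communities = {"A": {0, 1, 2}, "B": {3, 4, 5}, "U": {6}}
choice = best_community(adj, communities, 6, "U")
```

Here node 6 starts in its own singleton community "U", as in the first step of Alg. 1, and is then drawn to the community exerting the largest outer force.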

Theorem 2.1. Let C be the community having the maximum $F_{out}^{C}(u)$ when a new node u
with degree p is added to G; then joining u to C gives the maximal gained modularity.


Proof. Let D be a community of G with $D \neq C$; we show that joining u to D contributes
less modularity than joining u to C. The overall modularity when u joins C is

$$Q_1 = \frac{m_C + e_C^u}{M + p} - \frac{(d_C + e_C^u + p)^2}{4(M + p)^2} + \frac{m_D}{M + p} - \frac{(d_D + e_D^u)^2}{4(M + p)^2} + A,$$

where A is the summation of the other modularity contributions. Similarly, joining u to D
gives

$$Q_2 = \frac{m_C}{M + p} - \frac{(d_C + e_C^u)^2}{4(M + p)^2} + \frac{m_D + e_D^u}{M + p} - \frac{(d_D + e_D^u + p)^2}{4(M + p)^2} + A,$$

and

$$Q_1 - Q_2 = \frac{1}{M + p} \left( e_C^u - e_D^u + \frac{p (d_D - d_C + e_D^u - e_C^u)}{2(M + p)} \right).$$

Now, since C is the community that gives the maximum $F_{out}^{C}(u)$, we obtain

$$e_C^u - \frac{p (d_C + e_C^u)}{2(M + p)} > e_D^u - \frac{p (d_D + e_D^u)}{2(M + p)},$$

which implies

$$e_C^u - e_D^u + \frac{p (d_D - d_C + e_D^u - e_C^u)}{2(M + p)} > 0.$$

Hence, $Q_1 - Q_2 > 0$ and thus the conclusion follows. □

Algorithm 1 New_Node

Input: New node u with associated links; Current structure $\mathcal{C}_t$.
Output: An updated structure $\mathcal{C}_{t+1}$.
 1: Create a new community of only u;
 2: for $v \in N(u)$ do
 3:     Let v determine its best community;
 4: end for
 5: for $C \in NC(u)$ do
 6:     Find $F_{out}^{C}(u)$;
 7: end for
 8: if $\max_C F_{out}^{C}(u) > F_{in}^{C_u}(u)$ then
 9:     Let $C_u \leftarrow \arg\max_C \{F_{out}^{C}(u)\}$;
10:     Update $\mathcal{C}_{t+1}$: $\mathcal{C}_{t+1} \leftarrow (\mathcal{C}_t \setminus C_u) \cup (C_u \cup u)$;
11: end if


2.2.2  New  edge 

When a new edge e = (u, v) connecting two existing vertices u, v is introduced,
we divide it further into two subcases: e is an intra-community link (totally inside a
community C) or an inter-community link (connecting two communities C(u) and C(v)).
If e is inside a community C, its presence will strengthen the inner structure of C
according to Lemma 1. Furthermore, by Lemma 2, we know that adding e should not
split the current community C into smaller modules. Therefore, we leave the current
network structure intact in this case.

The interesting situation occurs when e is a link connecting communities C(u) and
C(v), since its presence could possibly make u (or v) leave its current module and join
the new community. Additionally, if u (or v) decides to change its membership, it can
advertise its new community to all its neighbors, and some of them might eventually want
to change their memberships as a consequence. By Lemma 3, we show that should u
(or v) ever change its community assignment, C(v) (or C(u)) is the best new community
for it. But how can we quickly decide whether u (or v) should change its membership
in order to form a better community structure with higher modularity? To this end, we
provide a criterion to test for the membership change of u and v in Theorem 2.2. Here, if
both $\Delta q_{u,C,D}$ and $\Delta q_{v,C,D}$ fail to satisfy the criterion, we can safely preserve the current
network community structure (Corollary 1). Otherwise, we move u (or v) to its new
community and consequently let its neighbors determine their best modules to join,
using local search and swapping to maximize gained modularity. Figure 2-1A describes
the procedure for this latter case. The detailed algorithm is described in Alg. 2.

Algorithm 2 New_Edge

Input: Edge {u, v} to be added; Current structure $\mathcal{C}_t$.
Output: An updated structure $\mathcal{C}_{t+1}$.
 1: if (u and v $\notin V$) then
 2:     $\mathcal{C}_{t+1} \leftarrow \mathcal{C}_t \cup \{u, v\}$;
 3: else if $C(u) \neq C(v)$ then
 4:     if $\Delta q_{u,C(u),C(v)} < 0$ and $\Delta q_{v,C(u),C(v)} < 0$ then
 5:         return $\mathcal{C}_{t+1} \equiv \mathcal{C}_t$;
 6:     else
 7:         $w = \arg\max\{\Delta q_{u,C(u),C(v)}, \Delta q_{v,C(u),C(v)}\}$;
 8:         Move w to the new community;
 9:         for $t \in N(w)$ do
10:             Let t determine its best community;
11:         end for
12:         Update $\mathcal{C}_{t+1}$;
13:     end if
14: end if

Lemma 1. For any $C \in \mathcal{C}$, if $d_C \leq M - 1$ then adding an edge within C will increase its
modularity contribution.

Proof. The portion $Q_{1C}$ that community C contributes to the overall modularity Q is

$$Q_{1C} = \frac{m_C}{M} - \frac{d_C^2}{4M^2}.$$

When a new edge comes in, the new modularity $Q_{2C}$ is

$$Q_{2C} = \frac{m_C + 1}{M + 1} - \frac{(d_C + 2)^2}{4(M + 1)^2}.$$

Now, taking the difference between $Q_{2C}$ and $Q_{1C}$ gives

$$\begin{aligned}
\Delta Q_C = Q_{2C} - Q_{1C}
&= \frac{4M^3 - 4m_C M^2 - 4d_C M^2 - 4m_C M + 2d_C^2 M + d_C^2}{4(M + 1)^2 M^2} \\
&\geq \frac{4M^3 - 6d_C M^2 - 2d_C M + 2d_C^2 M + d_C^2}{4(M + 1)^2 M^2} \quad \left(\text{since } m_C \leq \frac{d_C}{2}\right) \\
&= \frac{(2M^2 - 2d_C M - d_C)(2M - d_C)}{4(M + 1)^2 M^2} \\
&> 0.
\end{aligned}$$

The last inequality holds since $d_C \leq M - 1$ implies $2M^2 - 2d_C M - d_C > 0$. □
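Lemma 1 can also be sanity-checked numerically. The sketch below is an illustration, not part of the original proof: it uses exact rational arithmetic and enumerates all small cases satisfying $d_C \leq M - 1$ and $m_C \leq d_C / 2$. The function names and the enumeration bound are assumptions of the example.

```python
from fractions import Fraction


def contribution(m_c, d_c, M):
    """Modularity contribution m_C / M - d_C^2 / (4 M^2) of one community."""
    return Fraction(m_c, M) - Fraction(d_c * d_c, 4 * M * M)


def gain_from_intra_edge(m_c, d_c, M):
    """Change of the contribution of C when one edge is added inside C."""
    return contribution(m_c + 1, d_c + 2, M + 1) - contribution(m_c, d_c, M)


def lemma1_holds(max_edges):
    """Check the claim of Lemma 1 for every admissible case with M <= max_edges."""
    for M in range(1, max_edges + 1):
        for d_c in range(0, M):                  # d_C <= M - 1
            for m_c in range(0, d_c // 2 + 1):   # m_C <= d_C / 2
                if gain_from_intra_edge(m_c, d_c, M) <= 0:
                    return False
    return True


checked = lemma1_holds(30)
# Outside the hypothesis the gain can vanish: a single community that already
# holds every edge (m_C = M, d_C = 2M) gains nothing from one more edge.
boundary_gain = gain_from_intra_edge(10, 20, 10)
```

The boundary case shows why the degree condition in the lemma matters: without it the contribution of C need not strictly increase.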

Lemma 2. If C is a community in the current snapshot of G, then adding any intra-community
link to C should not split it into smaller modules.

Proof. Assume the contrary, i.e., C should be divided into smaller modules when
an edge is added into it. Let $X_1, X_2, \ldots, X_k$ be disjoint subsets of C representing these
modules. Let $d_i$ and $e_{ij}$ be the total degree of vertices inside $X_i$ and the number of links
going from $X_i$ to $X_j$, respectively. Assume, W.L.O.G., that when an edge is added inside
C, it is added to $X_1$. We will show that

$$\frac{\sum_{i<j} d_i d_j}{2M} < \sum_{i<j} e_{ij} < \frac{\sum_{i<j} d_i d_j}{2M} + 1,$$

which cannot happen since $e_{ij}$ is a natural number. Recall that

$$Q_{1C} = \frac{m_C}{M} - \frac{d_C^2}{4M^2}$$

and

$$Q_{1X_i} = \frac{m_i}{M} - \frac{d_i^2}{4M^2},$$

and prior to adding an edge to C, we have

$$Q_{1C} > \sum_{i=1}^{k} Q_{1X_i},$$

or equivalently,

$$\frac{m_C}{M} - \frac{d_C^2}{4M^2} > \sum_{i=1}^{k} \left( \frac{m_i}{M} - \frac{d_i^2}{4M^2} \right).$$

Since $X_1, X_2, \ldots, X_k$ are disjoint subsets of C, it follows that

$$d_C = \sum_{i=1}^{k} d_i$$

and

$$m_C = \sum_{i=1}^{k} m_i + \sum_{i<j} e_{ij}$$

(where $m_i$ is the number of links inside $X_i$). The above inequality is therefore equivalent to

$$\sum_{i<j} e_{ij} > \frac{\sum_{i<j} d_i d_j}{2M}.$$

Now, assume that the new edge is added to $X_1$ and C is split into $X_1, X_2, \ldots, X_k$, which
implies that dividing C into k smaller communities will increase the overall modularity,
i.e.,

$$Q_{2C} < \sum_{i=1}^{k} Q_{2X_i}.$$

Now,

$$\begin{aligned}
& Q_{2C} < \sum_{i=1}^{k} Q_{2X_i} \\
\Leftrightarrow\; & \frac{\sum_{i=1}^{k} m_i + \sum_{i<j} e_{ij} + 1}{M + 1} - \frac{\left(\sum_{i=1}^{k} d_i + 2\right)^2}{4(M + 1)^2} < \frac{\sum_{i=1}^{k} m_i + 1}{M + 1} - \frac{(d_1 + 2)^2 + \sum_{i=2}^{k} d_i^2}{4(M + 1)^2} \\
\Leftrightarrow\; & \sum_{i<j} e_{ij} < \frac{\sum_{i=1}^{k} d_i - 2d_1 + \sum_{i<j} d_i d_j}{2(M + 1)}.
\end{aligned}$$

Moreover, since it is obvious that $\sum_{i=1}^{k} d_i - 2d_1 < 2M$, we have

$$\frac{\sum_{i=1}^{k} d_i - 2d_1 + \sum_{i<j} d_i d_j}{2(M + 1)} < \frac{\sum_{i<j} d_i d_j}{2M} + 1,$$

and thus the conclusion follows. □


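Lemma 3. When a new edge (u, v) connecting communities C(u) and C(v) is introduced,
C(v) (or C(u)) is the best candidate for u (or v) if it should ever change its
membership.

Proof. Let $C = C(u)$ and $D = C(v)$. Recall that the outer force that a community S applies
to vertex u is

$$F_{out}^{S}(u) = e_S^u - \frac{d_u d_{outS}}{2M}.$$

We will show that the presence of edge (u, v) strengthens $F_{out}^{D}(u)$ while weakening
the other outer forces $F_{out}^{S}(u)$, i.e., we show that $F_{out}^{D}(u)$ increases while $F_{out}^{S}(u)$
decreases for all $S \notin \{C, D\}$. We have

$$\begin{aligned}
F_{out}^{D}(u)_{new} - F_{out}^{D}(u)_{old}
&= \left( e_D^u + 1 - \frac{(d_u + 1)(d_{outD} + 1)}{2(M + 1)} \right) - \left( e_D^u - \frac{d_u d_{outD}}{2M} \right) \\
&= \frac{2M + d_u d_{outD}}{2M} - \frac{d_u d_{outD} + d_{outD} + d_u + 1}{2(M + 1)} \\
&\geq \frac{2M + d_u d_{outD}}{2(M + 1)} - \frac{d_u d_{outD} + d_{outD} + d_u + 1}{2(M + 1)},
\end{aligned}$$

and thus $F_{out}^{D}(u)$ is strengthened when (u, v) is introduced. Furthermore, for any
community $S \in \mathcal{C}$ with $S \notin \{C, D\}$, we have

$$\begin{aligned}
F_{out}^{S}(u)_{new} - F_{out}^{S}(u)_{old}
&= \left( e_S^u - \frac{(d_u + 1) d_{outS}}{2(M + 1)} \right) - \left( e_S^u - \frac{d_u d_{outS}}{2M} \right) \\
&= d_{outS} \left( \frac{d_u}{2M} - \frac{d_u + 1}{2(M + 1)} \right) < 0,
\end{aligned}$$

which implies that $F_{out}^{S}(u)$ is weakened when (u, v) is connected. Hence, the conclusion
follows. □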


Theorem 2.2. Assume that a new edge (u, v) is added to the network. Let $C = C(u)$
and $D = C(v)$. If

$$\Delta q_{u,C,D} = 4(M + 1)(e_D^u + 1 - e_C^u) + e_C^u (2d_D - 2d_u - e_C^u) - 2(d_u + 1)(d_u + 1 + d_D - d_C) > 0$$

then joining u to D will increase the overall modularity.

Proof. Node u should leave its current community C and join D if

$$Q_{D+u} + Q_{C-u} > Q_C + Q_D,$$

or equivalently,

$$\frac{m_D + e_D^u + 1}{M + 1} - \frac{(d_D + d_u + 2)^2}{4(M + 1)^2} + \frac{m_C - e_C^u}{M + 1} - \frac{(d_C - d_u - e_C^u)^2}{4(M + 1)^2} > \frac{m_D}{M + 1} - \frac{(d_D + 1)^2}{4(M + 1)^2} + \frac{m_C}{M + 1} - \frac{(d_C + 1)^2}{4(M + 1)^2}.$$

The above inequality is equivalent to

$$4(M + 1)(e_D^u + 1 - e_C^u) + e_C^u (2d_D - 2d_u - e_C^u) - 2(d_u + 1)(d_u + 1 + d_D - d_C) > 0,$$

which concludes the Theorem. □

Corollary 1. If the condition in Theorem 2.2 is not satisfied, then neither u nor its
neighbors should be moved to D.

Proof. The proof follows from Theorem 2.2. □
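The membership test behind Theorem 2.2 compares the modularity obtained when u moves to D with the modularity obtained when u stays in C, after the new edge has been added. The sketch below evaluates this comparison directly from the definition of Q rather than through the closed-form quantity $\Delta q_{u,C,D}$; it is an illustrative check under that assumption, and the function names and toy graphs are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def modularity(edges, community_of):
    """Exact Q = sum over communities of m_C / M - d_C^2 / (4 M^2)."""
    M = len(edges)
    inside, degree = defaultdict(int), defaultdict(int)
    for a, b in edges:
        degree[community_of[a]] += 1
        degree[community_of[b]] += 1
        if community_of[a] == community_of[b]:
            inside[community_of[a]] += 1
    return sum(Fraction(inside[c], M) - Fraction(degree[c] ** 2, 4 * M * M)
               for c in degree)


def gain_if_moved(edges, community_of, new_edge, mover):
    """Q(mover joins the other endpoint's community) - Q(mover stays),
    both evaluated after `new_edge` has been added."""
    u, v = new_edge
    target = community_of[v] if mover == u else community_of[u]
    after = edges + [new_edge]
    moved = dict(community_of)
    moved[mover] = target
    return modularity(after, moved) - modularity(after, community_of)


# Hypothetical case 1: node 6 sits in C but is only loosely attached to it.
edges1 = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 6), (6, 3)]
comm1 = {0: "C", 1: "C", 2: "C", 6: "C", 3: "D", 4: "D", 5: "D"}
gain1 = gain_if_moved(edges1, comm1, (6, 4), mover=6)

# Hypothetical case 2: a bridge between two triangles; node 2 is well embedded.
edges2 = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
comm2 = {0: "C", 1: "C", 2: "C", 3: "D", 4: "D", 5: "D"}
gain2 = gain_if_moved(edges2, comm2, (2, 3), mover=2)
```

A positive gain (case 1) means that the move improves the overall modularity, while a negative gain (case 2) means that the current structure should be kept, in the spirit of Corollary 1.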

2.2.3  Node  removal 

When  an  existing  node  u  in  a  community  C  is  removed,  all  of  its  adjacent  edges 
are  disregarded  as  a  result.  This  case  is  challenging  in  the  sense  that  the  resulting 
community  is  very  complicated:  it  can  be  either  unchanged  or  broken  into  smaller  pieces 
and  could  probably  be  merged  with  other  communities.  Let’s  consider  two  extreme 
cases  when  a  single  degree  node  and  a  node  with  highest  degree  in  a  community 
is  removed.  If  a  single  degree  node  is  removed,  it  leaves  the  resulted  community 
unchanged  (Lemma  5).  However,  when  a  highest  degree  vertex  is  removed,  the  current 
community  might  be  disconnected  and  broken  in  to  smaller  pieces  which  then  are 
merged  to  other  communities  as  depicted  in  Figure  2-1 C.  Therefore,  identifying  the 
leftover  structure  of  C  is  a  crucial  part  once  a  vertex  in  C  is  removed. 

To handle this task efficiently, we use the clique percolation method presented in
[85]. In particular, when a vertex u is removed from C, we place a 3-clique at one of
its neighbors and let the clique percolate until no more vertices in C are discovered
(Figure 2-1D). We then let the remaining pieces of C choose the best communities to
merge into. The detailed algorithm is presented in Alg. 3.
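The percolation step can be sketched as a traversal over triangles that share an edge. The code below is an illustrative sketch under stated assumptions, not the implementation from [85]: the graph is a dictionary of adjacency sets, node labels are comparable (integers here), and the helper names `build_adj`, `three_clique_percolation` and `pieces_after_removal` are hypothetical.

```python
from itertools import combinations


def build_adj(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def three_clique_percolation(adj, seed, allowed):
    """Nodes covered by triangles reachable from a triangle containing `seed`.

    Two triangles are adjacent when they share an edge; only nodes in
    `allowed` take part. Returns an empty set if `seed` is in no triangle.
    """
    found, frontier, seen = set(), [], set()
    nbrs = [w for w in adj[seed] if w in allowed]
    for a, b in combinations(nbrs, 2):
        if b in adj[a]:  # (seed, a, b) is a triangle
            found.update((seed, a, b))
            frontier += [frozenset((seed, a)), frozenset((seed, b)),
                         frozenset((a, b))]
    while frontier:
        edge = frontier.pop()
        if edge in seen:
            continue
        seen.add(edge)
        a, b = tuple(edge)
        for c in adj[a] & adj[b]:
            if c in allowed:  # triangle (a, b, c) shares `edge`
                found.add(c)
                frontier += [frozenset((a, c)), frozenset((b, c))]
    return found


def pieces_after_removal(adj, u, community):
    """Split the neighbours of a removed node u into percolated pieces."""
    allowed = set(community) - {u}
    remaining = {v for v in adj[u] if v in allowed}
    pieces = []
    while remaining:
        v = min(remaining)  # deterministic choice of the next neighbour
        piece = three_clique_percolation(adj, v, allowed) or {v}
        pieces.append(piece)
        remaining -= piece
    return pieces
```

A neighbor that lies in no triangle becomes a singleton piece, which mirrors the fallback in the node-removal procedure. Letting each piece choose the community to merge into is left out of this sketch.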

Algorithm 3 Node_Removal

Input: Node $u \in C$ to be removed; current structure $\mathcal{C}_t$.

Output: An updated structure $\mathcal{C}_{t+1}$.

1: $i \leftarrow 1$;
2: while $N(u) \neq \emptyset$ do
3:     $S_i$ = {Nodes found by a 3-clique percolation on $v \in N(u)$};
4:     if $S_i == \emptyset$ then
5:         $S_i \leftarrow \{v\}$;
6:     end if
7:     $N(u) \leftarrow N(u) \setminus S_i$;
8:     $i \leftarrow i + 1$;
9: end while
10: Let each singleton in N(u) consider its best communities;
11: Let each $S_i$ consider its best communities as in [6];
12: Update $\mathcal{C}_t$;

2.2.4 Edge removal

In the last case, when an edge e = (u, v) is removed, we distinguish four subcases:
(1) e is a single edge connecting only u and v; (2) either u or v has degree one;
(3) e is an inter-community link connecting C(u) and C(v); and (4) e is an
intra-community link. If e is a single edge, its removal results in the same
community structure plus two singletons, u and v. The same reasoning applies to the
second subcase, when either u or v has degree one: by Lemma 5, the result is the
prior network structure plus the singleton u (or v). When e is an inter-community
link, the removal of e strengthens the current network communities (Lemma 4); hence,
we make no change to the overall network structure.

The last and most complicated case occurs when an intra-community link is deleted.
As depicted in Figure 2-1B, removing such an edge often leaves the community
unchanged if the community is densely connected; however, the community will be
divided if it contains substructures that are only loosely connected to each other.
Therefore, identifying the structure of the remaining modules is important.
Theorem 2.3 provides a convenient test for community bi-division when an
intra-community link is removed from its host community C. However, it requires
examining all subsets of C, which may be time consuming when C is large. Note that
prior to the removal of (u, v), the community C hosting this link should contain
dense connections within itself; thus, the removal of (u, v) should leave some sort
of 'quasi-clique' structure [85] inside C. Therefore, we find all maximal
quasi-cliques within the current community and have them (as well as leftover
singletons) determine the best communities to join. The detailed procedure is
described in Alg. 4.


Algorithm 4 Edge_Removal

Input: Edge (u, v) to be removed; current structure $\mathcal{C}_t$.

Output: An updated clustering $\mathcal{C}_{t+1}$.

1: if (u, v) is a single edge then
2:     $\mathcal{C}_{t+1} = (\mathcal{C}_t \setminus \{u, v\}) \cup \{u\} \cup \{v\}$;
3: else if either u or v is of degree one then
4:     $\mathcal{C}_{t+1} = (\mathcal{C}_t \setminus C(u)) \cup \{u\} \cup \{C(u) \setminus u\}$;
5: else if $C(u) \neq C(v)$ then
6:     $\mathcal{C}_{t+1} = \mathcal{C}_t$;
7: else
8:     % Now (u, v) is inside a community C %
9:     L = {Maximal quasi-cliques in C};
10:    Let the singletons in $C \setminus L$ consider their best communities;
11: end if
12: Update $\mathcal{C}_{t+1}$;
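The four-way case analysis of the edge-removal procedure can be sketched as a small classifier. This is only an illustration under assumptions: degrees are measured before the edge is deleted, `adj` is a dictionary of adjacency sets, `community_of` maps each node to a single (disjoint) community label, and the function name and returned strings are hypothetical.

```python
def edge_removal_case(adj, community_of, u, v):
    """Classify the removal of edge (u, v) into one of the four subcases."""
    deg_u, deg_v = len(adj[u]), len(adj[v])
    if deg_u == 1 and deg_v == 1:
        return "single_edge"        # u and v are connected only to each other
    if deg_u == 1 or deg_v == 1:
        return "degree_one"         # one endpoint becomes a singleton
    if community_of[u] != community_of[v]:
        return "inter_community"    # structure is left unchanged
    return "intra_community"        # needs the quasi-clique step


adj = {
    1: {2}, 2: {1},
    3: {4, 5, 9}, 4: {3, 5, 6}, 5: {3, 4}, 9: {3},
    6: {4, 7, 8}, 7: {6, 8}, 8: {6, 7},
}
community_of = {1: "a", 2: "a", 3: "b", 4: "b", 5: "b", 9: "b",
                6: "c", 7: "c", 8: "c"}
```

Only the last case requires the more expensive quasi-clique search; the first three are resolved with constant-time checks.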


Lemma  4.  If  C  and  D  are  two  communities  of  G,  then  the  removal  of  an  inter-community 
link  connecting  them  will  strengthen  modularity  contributions  of  both  C  and  D. 

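Proof. Let $Q_{1C}$ (resp. $Q_{1D}$) and $Q_{2C}$ (resp. $Q_{2D}$) be the modularities of C (resp. D)
before and after the removal of that link. We show that $Q_{2C} > Q_{1C}$ (and similarly,
$Q_{2D} > Q_{1D}$) and thus, C and D contribute higher modularities to the network.

\[ Q_{2C} - Q_{1C} = \left( \frac{m_1}{M-1} - \frac{(d_1 - 1)^2}{4(M-1)^2} \right) - \left( \frac{m_1}{M} - \frac{d_1^2}{4M^2} \right) = m_1 \left( \frac{1}{M-1} - \frac{1}{M} \right) + \left( \frac{d_1^2}{4M^2} - \frac{(d_1 - 1)^2}{4(M-1)^2} \right). \]

Since all terms are positive, $Q_{2C} - Q_{1C} > 0$. The same technique applies to show that
$Q_{2D} > Q_{1D}$. □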


Lemma 5. The removal of (u, v) inside a community C where only u or v is of degree
one will not separate C.

Proof. Assume the contrary, i.e., after the removal of (u, v) where $d_u = 1$, C
is broken into smaller communities $X_1, X_2, \ldots, X_k$ which contribute higher modularity:
$Q_{X_1} + \ldots + Q_{X_k} > Q_C$. W.L.O.G., suppose u was connected to $X_1$ prior to its removal. It
follows that $Q_{X_1+u} > Q_{X_1}$ and thus $Q_{X_1+u} + \ldots + Q_{X_k} > Q_C$, which is a contradiction
since C is originally a community of $\mathcal{C}$. □

Lemma 6. (Separation of a community) Let $C_1 \subset C$ and $C_2 = C \setminus C_1$ be two disjoint
subsets of C. $(\mathcal{C} \setminus C) \cup \{C_1, C_2\}$ is a community structure with higher modularity when an
edge crossing $C_1$ and $C_2$ is removed, i.e., C should be separated into $C_1$ and $C_2$, if and
only if $e_{12} < \frac{d_1 d_2 - d_C + 1}{2(M-1)} + 1$.

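Proof. Let $q_1$, $q_2$ and $q_C$ denote the modularity contributions of $C_1$, $C_2$ and $C$ after an
edge crossing $(C_1, C_2)$ has been removed. Using $d_C = d_1 + d_2$, we have

\[
\begin{aligned}
& e_{12} < \frac{d_1 d_2 - d_C + 1}{2(M-1)} + 1 \\
\Leftrightarrow\ & \frac{2 d_1 d_2 - 2 d_C + 2}{4(M-1)^2} > \frac{e_{12} - 1}{M-1} \\
\Leftrightarrow\ & \frac{(d_1 + d_2 - 2)^2}{4(M-1)^2} - \frac{(d_1 - 1)^2}{4(M-1)^2} - \frac{(d_2 - 1)^2}{4(M-1)^2} > \frac{m_1 + m_2 + e_{12} - 1}{M-1} - \frac{m_1}{M-1} - \frac{m_2}{M-1} \\
\Leftrightarrow\ & \frac{m_1}{M-1} - \frac{(d_1 - 1)^2}{4(M-1)^2} + \frac{m_2}{M-1} - \frac{(d_2 - 1)^2}{4(M-1)^2} > \frac{m_1 + m_2 + e_{12} - 1}{M-1} - \frac{(d_1 + d_2 - 2)^2}{4(M-1)^2} \\
\Leftrightarrow\ & q_1 + q_2 > q_C.
\end{aligned}
\]

Thus, the conclusion follows. □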
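The equivalence in Lemma 6 can be checked numerically by comparing the closed-form threshold with a direct comparison of modularity contributions. The sketch below rests on assumptions that are not spelled out in the lemma: a community with m internal links and total degree d in a network of M links contributes m/M - d^2/(4M^2); `e12` counts the links between the two parts before the removal; the removed link crosses the two parts, so `m1` and `m2` are unchanged; and M >= 2. All function names are hypothetical.

```python
from fractions import Fraction


def separation_threshold(d1, d2, M):
    """Right-hand side of the Lemma 6 condition, with d_C = d1 + d2."""
    d_C = d1 + d2
    return Fraction(d1 * d2 - d_C + 1, 2 * (M - 1)) + 1


def should_separate(e12, d1, d2, M):
    """Closed-form test: split C into C1 and C2 after the removal?"""
    return e12 < separation_threshold(d1, d2, M)


def contribution(m, d, M):
    """Assumed modularity contribution of one community."""
    return Fraction(m, M) - Fraction(d * d, 4 * M * M)


def separation_gains(m1, m2, e12, d1, d2, M):
    """Direct comparison q1 + q2 > qC, evaluated after the link is removed."""
    M_after = M - 1
    q1 = contribution(m1, d1 - 1, M_after)
    q2 = contribution(m2, d2 - 1, M_after)
    q_c = contribution(m1 + m2 + e12 - 1, d1 + d2 - 2, M_after)
    return q1 + q2 > q_c
```

Exact rational arithmetic is used so that borderline cases are not affected by floating-point rounding.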

Theorem 2.3. (Community bi-division) For any community C, let $\alpha$ and $\beta$ be the lowest
and the second highest degree of vertices in C, respectively. Assume that an edge e is
removed from C. If there do not exist subsets $C_1 \subset C$ and $C_2 = C \setminus C_1$ such that e is
crossing $C_1$ and $C_2$ and
$\frac{\min\{\alpha(d_C - \alpha),\, \beta(d_C - \beta)\}}{2M} < e_{12} < \frac{(d_C - 2)^2}{8(M-1)} + 1$,
then any bi-division of C will not benefit the overall Q.
Proof. From Lemma 6, it follows that in order to really benefit the overall modularity we
must have

\[ \frac{d_1 d_2}{2M} < e_{12} < \frac{d_1 d_2 - d_C + 1}{2(M-1)} + 1. \]

Now we find an upper bound for the RHS inequality. Since $d_1 + d_2 = d_C$, it follows that

\[ e_{12} < \frac{d_1 d_2 - d_C + 1}{2(M-1)} + 1 \leq \frac{\frac{(d_1 + d_2)^2}{4} - d_C + 1}{2(M-1)} + 1 = \frac{(d_C - 2)^2}{8(M-1)} + 1. \]

For a lower bound of the LHS inequality, we rewrite $d_1 d_2$ as $d_1 d_2 = d_1(d_C - d_1) =
d_1 d_C - d_1^2$ and find the non-zero minimum value on the range $d_1 \in [\alpha, \beta]$. In this interval,
$d_1 d_C - d_1^2$ is minimized either at $d_1 = \alpha$ or $d_1 = \beta$. Therefore,

\[ \frac{\min\{\alpha(d_C - \alpha),\, \beta(d_C - \beta)\}}{2M} \leq \frac{d_1 d_2}{2M} < e_{12} < \frac{(d_C - 2)^2}{8(M-1)} + 1. \]

□
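Because the bounds in Theorem 2.3 depend only on $\alpha$, $\beta$, $d_C$ and M, they give a cheap filter on the number of crossing links before any subset of C is examined. The following is a hedged sketch of that filter, with hypothetical function names and exact rational arithmetic; it assumes M >= 2.

```python
from fractions import Fraction


def bidivision_window(alpha, beta, d_C, M):
    """Open interval (lower, upper) that e12 must fall into for a useful split."""
    lower = Fraction(min(alpha * (d_C - alpha), beta * (d_C - beta)), 2 * M)
    upper = Fraction((d_C - 2) ** 2, 8 * (M - 1)) + 1
    return lower, upper


def may_split(e12, alpha, beta, d_C, M):
    """False means no bi-division with e12 crossing links can raise modularity."""
    lower, upper = bidivision_window(alpha, beta, d_C, M)
    return lower < e12 < upper
```

For instance, with $\alpha = 1$, $\beta = 4$, $d_C = 10$ and M = 20, the window is (9/40, 27/19), so only a bi-division with exactly one crossing link passes the filter.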


Finally,  our  QCA  method  for  quickly  updating  the  network  community  structure  is 
presented  in  Alg.  5. 


Algorithm 5 Quick Community Adaptation (QCA)

Input: $G \equiv G_0 = (V_0, E_0)$; $\mathcal{E} = \{\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_s\}$, a collection of simple events.
Output: Community structure $\mathcal{C}_t$ of $G_t$ at time t.

1: Use [6] to find an initial community clustering $\mathcal{C}_0$ of $G_0$;
2: for ($t \leftarrow 1$ to $s$) do
3:     $\mathcal{C}_t \leftarrow \mathcal{C}_{t-1}$;
4:     if $\mathcal{E}_t$ = newNode(u) then
5:         New_Node($\mathcal{C}_t$, u);
6:     else if $\mathcal{E}_t$ = newEdge((u, v)) then
7:         New_Edge($\mathcal{C}_t$, (u, v));
8:     else if $\mathcal{E}_t$ = removeNode(u) then
9:         Remove_Node($\mathcal{C}_t$, u);
10:    else
11:        Remove_Edge($\mathcal{C}_t$, (u, v));
12:    end if
13: end for
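The control flow of QCA is a dispatch over event types that carries the previous structure forward. The sketch below shows only that skeleton. The handlers are toy stand-ins written for this example (they are not the New_Node or Remove_Node procedures), and the event encoding as `(kind, payload)` pairs is an assumption.

```python
def qca(initial_structure, events, handlers):
    """Apply each event to the latest structure; return all structures."""
    current = initial_structure
    history = [current]
    for kind, payload in events:
        current = handlers[kind](current, payload)
        history.append(current)
    return history


def toy_new_node(structure, u):
    """Toy handler: a new node starts as a singleton community."""
    return structure | {frozenset({u})}


def toy_remove_node(structure, u):
    """Toy handler: drop u and discard communities that become empty."""
    return frozenset(c - {u} for c in structure if c - {u})


handlers = {"newNode": toy_new_node, "removeNode": toy_remove_node}
start = frozenset({frozenset({1, 2, 3})})
history = qca(start, [("newNode", 4), ("removeNode", 2)], handlers)
```

Keeping the handlers in a dictionary makes it easy to swap in real update procedures without touching the loop.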
Figure 2-2. NMI scores on synthesized networks with known communities. Panels: (A) N = 1000, $\mu$ = 0.1; (B) N = 1000, $\mu$ = 0.3; (C) N = 5000, $\mu$ = 0.1; (D) N = 5000, $\mu$ = 0.3.


2.3  Experimental  Results 

In this section, we first validate our approaches on different synthesized networks
with known ground truths, and then present our findings on real-world traces including
the Enron email [98], arXiv e-print citation [22], and Facebook social networks [100].

To assess the performance of our algorithms, we compare QCA with other adaptive
community detection methods, including (1) the MIEN algorithm proposed by Thang et
al. [26], (2) the FacetNet framework proposed by Lin et al. [70], and (3) the OSLOM
method suggested by Lancichinetti et al. [60].
Figure 2-3. Modularity values on synthesized networks with known communities. Panels: (A) N = 1000, $\mu$ = 0.1; (B) N = 1000, $\mu$ = 0.3; (C) N = 5000, $\mu$ = 0.1; (D) N = 5000, $\mu$ = 0.3.


2.3.1  Results  on  synthesized  networks 

Of course, the best way to evaluate our approaches is to validate them on real
networks with known community structures. Unfortunately, we often do not know those
structures beforehand, or such structures cannot be easily mined from the network
topology. Although synthesized networks might not reflect all the statistical
properties of real ones, they provide known ground truths via planted communities, and
the ability to vary other parameters such as sizes, densities and overlapping levels.
Testing community detection methods on generated data has become a common practice
that is widely accepted in the field [58]. Hence, comparing QCA with other dynamic
methods on synthesized networks not only certifies its performance but also gives us
confidence in its behavior on real-world traces.

Setup. We use the well-known LFR benchmark [58] to generate 40 networks with
10 snapshots. The parameters are the number of nodes $N \in \{1000, 5000\}$ and the mixing
parameter $\mu \in \{0.1, 0.3\}$, which controls the overall sharpness of the community structure.

In order to quantify the similarity between the identified communities and the ground
truth, we adopt a well-known measure from information theory called Normalized Mutual
Information (NMI). NMI has been shown to be reliable and is currently used in testing
community detection algorithms [58]. Basically, NMI(U, V) equals 1 if structures U and
V are identical and equals 0 if they are totally separated; the higher the NMI, the better.
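For disjoint partitions, NMI can be computed from label counts alone. The text does not state which normalization is used, so the sketch below assumes one common choice, 2 I(U;V) / (H(U) + H(V)); the function name `nmi` and the list-of-labels input format are also assumptions.

```python
from collections import Counter
from math import log


def nmi(labels_u, labels_v):
    """Normalized mutual information of two labelings of the same nodes."""
    n = len(labels_u)
    count_u, count_v = Counter(labels_u), Counter(labels_v)
    joint = Counter(zip(labels_u, labels_v))
    h_u = -sum(c / n * log(c / n) for c in count_u.values())
    h_v = -sum(c / n * log(c / n) for c in count_v.values())
    mutual = sum(
        c / n * log((c / n) / ((count_u[a] / n) * (count_v[b] / n)))
        for (a, b), c in joint.items()
    )
    if h_u + h_v == 0:  # both partitions put every node in one group
        return 1.0
    return 2 * mutual / (h_u + h_v)
```

Two partitions that group the nodes identically score 1 even if the community labels differ, and two statistically independent partitions score 0.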

Results. The NMI and modularity values are reported in Figures 2-2 and 2-3.

As depicted in their subfigures, the NMI values and modularities achieved by our
QCA method are, in general, very high and competitive with those of OSLOM, and
much better than those produced by the MIEN and FacetNet methods. On these
generated networks, we observe that MIEN and FacetNet perform well when the mixing
parameter $\mu$ is small, i.e., when the network community structures are clear; however,
their performance degrades dramatically when these structures become less clear as
$\mu$ gets larger. In particular, the NMI scores and modularities of MIEN and FacetNet in
all test cases are fairly low, usually from 10% to 50% and 5% to 15% worse than those
produced by QCA. This implies that the network communities revealed by these methods
are not as similar to the ground truth as those found by QCA. On the generated
networks, the OSLOM algorithm performs very well, as suggested by its high NMI scores
and modularity values. In particular, OSLOM tends to perform better than QCA in the
first few network snapshots; however, it is overtaken by QCA as the networks evolve
over time, especially at the end of the evolution, where OSLOM shows big gaps in
similarity to the planted network communities. (Note that the higher the NMI score at
the end of the evolution, the better the final detected community structure.) We
conclude that the network communities discovered by QCA are the most similar to the
planted ground truth among the compared methods.

2.3.2  Results  on  real-world  traces 

We next present the results of the QCA algorithm on real-world dynamic social
networks including the Enron email [98], arXiv e-print citation [22], and Facebook
networks [100]. Due to the lack of ground-truth communities for these traces, we
report the performance of the aforementioned algorithms in reference to the static
method proposed by Blondel et al. [6]. In particular, we report the following
quantities: (1) modularity values, (2) the quality of the identified network
communities through NMI scores, and (3) the processing time of QCA in comparison with
other methods. The above networks appear to contain strong community structures, as
indicated by their high modularities, which was the main reason for choosing them.

For each network, time information is first extracted, and a portion of the network
data (usually the first snapshot) is then collected to form the basic network
community structure. Our QCA method (as well as MIEN and OSLOM) takes that basic
community structure into account and runs only on the network changes, whereas the
static method has to be performed on the whole network snapshot at each time point.
In this experiment, the FacetNet method did not complete the tasks in a timely manner
and is thus excluded from the plots.

ENRON  email  network 

Data. The Enron email network contains email message data from about 150
users, mostly senior management of Enron Inc., from January 1999 to July 2002
[98]. Each email address is represented by a unique ID in the dataset, and each link
corresponds to a message between the sender and the receiver. After a data refinement
process, we choose 50% of the total links to form a basic community structure of the
network with 7 major communities, and simulate the network evolution via a series of 21
growing snapshots.
Figure 2-4. Simulation results on Enron email network. Panels: (A) Modularity; (B) Number of Communities; (C) Running Time (s).


Results. We first evaluate the modularity values computed by the QCA, MIEN,
OSLOM, and Blondel methods. As shown in Figure 2-4A, our QCA algorithm achieves
competitively higher modularities than the static method, slightly lower than those
of MIEN, and far better than those obtained by OSLOM. Moreover, QCA also succeeds in
maintaining the same numbers of communities as the other two methods, MIEN and
Blondel, while those of OSLOM fluctuate (Figure 2-4B). In particular, the modularity
values produced by QCA closely approximate those found by the static method, with
less variation. There are reasons for this. Recall that our QCA algorithm takes into
account the basic community structure detected by the static method (at the first
snapshot) and processes network changes only. Knowing the basic network community
structure is a great advantage of our QCA algorithm: it avoids the hassle of
searching and computing from scratch to update the network with changes. In fact, QCA
uses the basic structure to find and quickly update the locally optimal communities
to adapt to changes introduced during the network evolution.

Figure 2-5. Simulation results on arXiv e-print citation network.

The running times of QCA and the static method on this small network are relatively
close: the static method requires one second to complete each of its tasks, while
QCA requires less than one second (Figure 2-4C). On this dataset, MIEN and OSLOM
require a little more time (1.5 and 2.4 seconds on average, respectively) to
complete their tasks. Time and computational cost are significantly reduced in QCA
since our algorithms only take into account the network changes, while the static
method has to work on the whole network every time.

Figure 2-6. Simulation results on Facebook social network.

As reported in Figure 2-4D, the NMI scores of both our method and MIEN are very
high and relatively close to 1, while those obtained by OSLOM fall short and are far
from stable. These results indicate that on this Enron email network, both the QCA
and MIEN algorithms are able to identify high-quality community structure with high
modularity and similarity; however, only our method significantly reduces the
processing time and computational requirement.
arXiv e-print citation network

Data. The arXiv e-print citation network [22] has become an essential means of
assessing research results in various areas including physics and computer science.
This network contains more than 225K articles from January 1996 to May 2003. In
our experiments, citation links from the first two years, 1996 and 1997, were used to
form the basic community structure for our QCA method. In order to simulate the
network evolution, a total of 30 time-dependent snapshots are created on a regular
two-month basis from January 1998 to January 2003.

Results. We compare the modularity results obtained by the QCA algorithm at each
network snapshot to those of Blondel as well as the MIEN and OSLOM methods.
Figure 2-5A shows that the modularities returned by QCA are very close to those
obtained by the static method, are much more stable, and are far higher than those
obtained by OSLOM and MIEN. In particular, the modularity values produced by the QCA
algorithm range from 94% up to 100% of those of the Blondel method, are 6% to 10%
higher than those of MIEN, and are at least 1.5x better than those of OSLOM. In this
citation network, the number of communities detected by OSLOM exceeds 1200, whereas
the numbers found by the QCA, MIEN and Blondel methods are relatively small
(Figure 2-5B). Our QCA method discovers more communities than both Blondel and MIEN
as the network evolves; this can be explained by the resolution limit of modularity
[35]: the static method might disregard some small communities and tend to combine
them in order to maximize the overall network modularity.

A second observation on the running time shows that QCA outperforms the static
method as well as its competitor MIEN: QCA takes at most 2 seconds to complete
updating the network structure, while the Blondel method requires more than triple
that amount of time, and MIEN and OSLOM require more than 5 times as much
(Figure 2-5C). In addition, the higher NMI scores of QCA compared with those of MIEN
and especially OSLOM (Figure 2-5D) imply that the network communities identified by
our approach are not only of high similarity to the ground truth but also more
precise than those detected by MIEN, while the computational cost and the running
time are significantly reduced.

Facebook  social  network 

Data. This dataset contains friendship information from the New Orleans regional
network on Facebook [100], spanning from September 2006 to January 2009, with
more than 60K nodes (users) connected by more than 1.5 million friendship links. In
our experiments, nodes and links from September 2006 to December 2006 are used
to form the basic community structure of the network, and a network snapshot is
recorded every month from January 2007 to January 2009, for a total of 25 network
snapshots.

Results. The evaluation depicted in Figure 2-6A reveals that the QCA algorithm
achieves modularities competitive with those of the static method, and again far
better than those obtained by the MIEN and OSLOM methods, especially OSLOM, which
performed well on synthesized networks. In general, the line representing the QCA
results closely approximates that of the static method, with much more stability.
Moreover, the two final modularity values at the end of the experiment are nearly
the same, which means that our adaptive method performs competitively with the
static method running on the whole network.

Figure 2-6C describes the running time of the methods on the Facebook data set.
As one can see from this figure, QCA takes at least 3 seconds and at most
4.5 seconds to compute and update each network snapshot, whereas the static method,
again, requires more than triple that processing time. The MIEN and OSLOM methods
really suffer on this large-scale network, requiring more than 10x and 11x the
running time of QCA, respectively. In conclusion, high NMI and modularity scores
together with decent execution times on all test cases confirm the effectiveness of
our adaptive method, especially when applied to real-world social networks where a
centralized algorithm, or other dynamic algorithms, may not be able to detect a good
network community structure in a timely manner.

However, we observe a limitation of the QCA algorithm on this large network that we
want to point out here: as the network evolution lasts longer (i.e., as the number of
network snapshots increases), our method tends to divide the network into smaller
communities to maximize the local modularity, which results in an increasing number
of communities and decreasing NMI scores. Figures 2-6B and 2-6D illustrate this
observation. For instance, at snapshot 12 (a year after December 2006), the NMI score
is approximately 1/2 and continues to decay after this time point. This implies that
a refresh of the network community structure is required at this time, after a long
enough duration. This is reasonable since activities on an online social network,
especially on Facebook, tend to come and go rapidly, and local adaptive procedures
are not enough to reflect the whole network topology over a long period of time.
CHAPTER 3

OVERLAPPING COMMUNITY STRUCTURE DETECTION

In this chapter, we present AFOCS, an adaptive framework to discover and trace the
evolution of network communities in dynamic complex systems. In Section 3.1, we first
state the problem definition, including basic notations and the dynamic network model.
Next, we present the procedure to detect the basic community structure in Section 3.2,
and then our AFOCS framework to update and trace the community structure evolution
over time in Section 3.3. Finally, we present the empirical results in Section 3.4.

3.1  Problem  Formulation 

3.1.1  Basic  notations 

Let G = (V, E) be an undirected unweighted graph representing the network,
where V is the set of N nodes and E is the set of M connections. Denote by
$\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ the network community structure, i.e., a collection of subsets of V
where each $C_i \in \mathcal{C}$ and its induced subgraph form a community of G. In contrast with
the disjoint community structure, we allow $C_i \cap C_j \neq \emptyset$ so that network communities
can overlap with each other. For a node $u \in V$, let $d_u$, $N(u)$ and $Com(u)$ denote its
degree, its neighbors and its set of community labels, respectively. For any $C \subseteq V$,
let $C^{in}$ and $C^{out}$ denote the set of links having both endpoints in C and the set of links
having exactly one endpoint in C, respectively. Finally, the terms node-vertex as well as
edge-link-connection are used interchangeably.
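The two link sets defined above can be obtained in a single pass over the edge list. This is a small illustrative sketch; the function name and the representation of links as frozensets are assumptions made for the example.

```python
def internal_and_boundary_links(edges, C):
    """Return (C_in, C_out): links with two endpoints in C, and with exactly one."""
    C = set(C)
    c_in, c_out = set(), set()
    for u, v in edges:
        endpoints_inside = (u in C) + (v in C)
        link = frozenset((u, v))
        if endpoints_inside == 2:
            c_in.add(link)
        elif endpoints_inside == 1:
            c_out.add(link)
    return c_in, c_out


edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
c_in, c_out = internal_and_boundary_links(edges, {1, 2, 3})
```

Here the triangle on nodes 1, 2, 3 yields three internal links, and the link (3, 4) is the only one with exactly one endpoint in C.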

3.1.2  Dynamic  network  model 

Let $G_0 = (V_0, E_0)$ be the original input network and $G_t = (V_t, E_t)$ be a
time-dependent network snapshot recorded at time t. Denote by $\Delta V_t$ and $\Delta E_t$ the sets of
nodes and edges to be added to or removed from the network at time t. Furthermore,
let $\Delta G_t = (\Delta V_t, \Delta E_t)$ describe this change in terms of the whole network. The network
snapshot at the next time step t + 1 is expressed as a combination of the previous one
together with the change, i.e., $G_{t+1} = G_t \cup \Delta G_t$. Finally, a dynamic network $\mathcal{G}$ is defined
as a sequence of network snapshots changing over time: $\mathcal{G} = (G_0, G_1, G_2, \ldots)$.

Figure 3-1. Overlapped vs. non-overlapped community structures.
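The snapshot update can be sketched as a function that applies one change to one snapshot. The model writes the change as a single pair of sets covering both additions and removals; the sketch below splits it into explicit additions and deletions, which is an assumption made for clarity, as are the function and parameter names.

```python
def apply_change(nodes, edges, add_nodes=(), add_edges=(),
                 del_nodes=(), del_edges=()):
    """Return the next snapshot (nodes, edges) after applying one change."""
    new_nodes = set(nodes) | set(add_nodes)
    new_nodes |= {x for e in add_edges for x in e}  # endpoints of new links
    new_nodes -= set(del_nodes)
    new_edges = {frozenset(e) for e in edges} | {frozenset(e) for e in add_edges}
    new_edges -= {frozenset(e) for e in del_edges}
    # Links incident to a removed node disappear with it.
    new_edges = {e for e in new_edges if e <= new_nodes}
    return new_nodes, new_edges


g1 = apply_change({1, 2, 3}, [(1, 2), (2, 3)], add_edges=[(3, 4)], del_nodes=[1])
```

Applying such changes one after another produces the sequence of snapshots that makes up the dynamic network.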

3.1.3  Density  function 

In order to quantify the goodness of an identified community, we use the popular
density function $\Psi$ [33] defined as $\Psi(C) = \frac{|C^{in}|}{\binom{|C|}{2}}$, where $C \subseteq V$. Unlike the case of
disjoint community structure, in which the number of connections crossing communities
should be less than those inside them, our objective does not take into account the
number of outgoing links from each community. To understand the reason, let us
consider a simple example pictured in Figure 3-1. From the overlapping community
structure point of view, it is clear that every clique should form a community on its own,
and each community shares exactly one node with the central clique. However, from the
disjoint community structure point of view, any vertex in the central clique has n internal
and 2n external connections, which violates the concept of a community in the strong
sense. Furthermore, the internal connectivity of the central clique is also dominated by
its external density, which implies that the concept of a community in the weak sense is
also violated. (A community C is in a weak sense if $|C^{in}| > |C^{out}|$, and in a strong sense if
any node in C has more links inward than outward of C [91].)
In order to set up a threshold on the internal density that suffices for a set of nodes
C to be a local community, we propose a function $\tau(C)$ defined as follows:

\[ \tau(C) = \frac{\sigma(C)}{\binom{|C|}{2}}, \quad \text{where} \quad \sigma(C) = \binom{|C|}{2}^{\,1 - \frac{1}{\binom{|C|}{2}}}. \]

Here $\sigma(C)$ is the threshold on the number of inner connections that suffices for C to be
a local community. In particular, a subgraph induced by C is a local community iff
$\Psi(C) \geq \tau(C)$ or equivalently $|C^{in}| \geq \sigma(C)$. Several functions with the same purpose have been
introduced in the literature, for instance, in [56][62], and it is worth noting
the main differences between them and ours. First and foremost, our functions
$\tau(C)$ and $\sigma(C)$ operate locally on the candidate community C only and do not require
any predefined thresholds or user-input parameters. Second, by Proposition 3.1, $\sigma(C)$
and $\tau(C)$ are increasing functions and closely approach C's full connectivity as well as
its maximal density. That makes $\sigma(C)$ and $\tau(C)$ relaxed versions of the traditional
density function, yet useful ones, as we shall see in the experiments.

Proposition 3.1. The function $f(n) = n^{1 - \frac{1}{n}}$ is strictly increasing for $n \geq 4$ and
$\lim_{n \to \infty} f(n) = n$.
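The density and its threshold are simple to compute from the size of C and its number of internal links. The sketch below assumes the forms $\Psi(C) = |C^{in}| / \binom{|C|}{2}$, $\sigma(C) = f\big(\binom{|C|}{2}\big)$ with f as in Proposition 3.1, and $\tau(C) = \sigma(C) / \binom{|C|}{2}$. These forms are an assumption, chosen because they reproduce the values $\Psi(C) = 0.9$ and $\tau(C) = 0.794$ quoted with Figure 3-2 for a five-node set with nine internal links. The function names are hypothetical, and sets of fewer than two nodes are not handled.

```python
from math import comb


def psi(num_internal_links, size):
    """Internal density: internal links over the number of node pairs."""
    return num_internal_links / comb(size, 2)


def sigma(size):
    """Assumed threshold on the number of internal links, f(p) = p**(1 - 1/p)."""
    pairs = comb(size, 2)
    return pairs ** (1 - 1 / pairs)


def tau(size):
    """Assumed threshold on the internal density."""
    return sigma(size) / comb(size, 2)


def is_dense_enough(num_internal_links, size):
    """True when the node set passes the local-community threshold."""
    return num_internal_links >= sigma(size)
```

For a set of five nodes, the threshold is about 7.94 links out of 10 possible ones, so nine internal links pass the test while seven do not.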

3.1.4  Objective  function 

Our objective is to find a community assignment for the set of nodes V which
maximizes the overall internal density function $\Psi(\mathcal{C}) = \sum_{C_i \in \mathcal{C}} \Psi(C_i)$, since the higher
the internal density of a community is, the clearer its structure would be. Although our
objective puts more focus on the internal edges and less on the external edges,
these external edges are not completely ignored: they will be tested later for the
formation of another community if the number of edges suffices. Only when these
external edges are really sparse will they not be considered.
3.1.5 Problem definition

We are given a dynamic network $\mathcal{G} = (G_0, G_1, G_2, \ldots)$, where $G_0$ is the input network and
$G_1, G_2, \ldots$ are network snapshots obtained through a collection of network topology
changes $\Delta G_1, \Delta G_2, \ldots$ over time. The problem asks for an adaptive framework to
efficiently detect and update the network overlapping community structure $\mathcal{C}_t$ at any
time point t by utilizing only the information from the previous snapshot $\mathcal{C}_{t-1}$, as well as
to trace the evolution of the network communities.

In  the  next  section,  we  present  our  main  contribution:  an  adaptive  framework  for 
(1)  identifying  basic  overlapped  community  structure  in  a  network  snapshot  and  (2) 
updating  as  well  as  tracing  the  evolution  of  the  network  communities  in  a  dynamic 
network  model.  First,  we  describe  FOCS,  a  procedure  to  identify  the  basic  communities 
in a static network, and then discuss in detail how AFOCS adaptively updates
these basic communities to keep up with the evolution of the dynamic network.

3.2  Basic  Community  Structure  Detection 
We describe FOCS, the first phase of our framework, which quickly discovers the
basic overlapping network community structure. In general, FOCS classifies network
nodes into different groups by first locating all possible densely connected parts of
the network (3.2.1), and then combining those that highly overlap with each other,
i.e., those that share a significant substructure (3.2.2). Finally, a refinement step
that groups unassigned nodes into different communities is conducted (3.2.3).

In FOCS, $\beta$ (the input overlapping threshold) defines how much substructure two
communities can share. Note that FOCS fundamentally differs from [1] in that it
allows $|C_i \cap C_j| \geq 2$ for any subsets $C_i$, $C_j$ of V, and consequently allows network
communities to overlap not only at a single vertex but also at a part of the whole
community.
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Figure  3-2.  Locating  and  merging  local  communities. 


3.2.1  Locating  local  communities 

Local  communities  are  connected  parts  of  the  network  whose  internal  densities  are 
greater  than  a  certain  level.  In  FOCS,  this  level  is  automatically  determined  based  on 
the  function  r()  and  the  size  of  each  corresponding  part.  Particularly,  a  local  community 
is  defined  based  on  a  connection  (u,  v)  when  the  number  of  internal  connections  in  the 
subgraph  induced  by  C  =  {u,  v}  u  (N(u)  n  N(v))  exceeds  a(C),  or  in  other  words,  when 
M/(C)  >  r(C)  as  illustrated  in  Figure  3-2A.  Here,  (a)  A  local  community  C  defined  by  a 
link  (u,  v ).  Here  v|/(c)  =  0.9  >  r(C)  =  0.794  (b)  Merging  two  local  communities  sharing 
a  significant  substructure  (OS  score  =  1.027  >  /3  =  0.8). 
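The density test above can be sketched in a few lines of Python. The definitions of $\sigma$ and $\tau$ are given outside this section, so the forms used here, $\sigma(C) = \binom{|C|}{2}^{1 - 1/\binom{|C|}{2}}$ and $\tau(C) = \sigma(C)/\binom{|C|}{2}$, are assumptions; they were chosen because they reproduce the values $\tau(4) \approx 0.74$ and $0.794$ quoted in this chapter. All function names and the example graph are hypothetical.

```python
from itertools import combinations
from math import comb


def sigma(size):
    """Assumed edge-count threshold: binom(size, 2) ** (1 - 1 / binom(size, 2))."""
    pairs = comb(size, 2)
    return pairs ** (1 - 1 / pairs)


def tau(size):
    """Assumed density threshold: sigma(size) / binom(size, 2)."""
    return sigma(size) / comb(size, 2)


def internal_edges(nodes, adj):
    """Number of edges of the subgraph induced by `nodes`."""
    return sum(1 for a, b in combinations(nodes, 2) if b in adj[a])


def density(nodes, adj):
    """Internal density Psi: internal edges divided by the number of node pairs."""
    return internal_edges(nodes, adj) / comb(len(nodes), 2)


def candidate_from_edge(u, v, adj):
    """C = {u, v} united with the common neighbours of u and v."""
    return {u, v} | (adj[u] & adj[v])


# Hypothetical example: a 5-clique with the edge (3, 4) removed.
adj = {i: {j for j in range(5) if j != i} for i in range(5)}
adj[3].discard(4)
adj[4].discard(3)

C = candidate_from_edge(0, 1, adj)
print(len(C), density(C, adj), round(tau(len(C)), 3))
```

In this example the candidate built from the link (0, 1) has five nodes and nine of the ten possible edges, so its density of 0.9 passes the assumed threshold for a community of size five.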

However, there is a problem that might eventually arise: the containment of
sub-communities in an actual bigger one. Intuitively, one would like to detect a bigger
community unified by smaller ones if the bigger community is itself densely connected.

In order to filter this undesired case, we impose
$\Psi\left(\bigcup_{i=1}^{s} C_i\right) < \tau\left(\bigcup_{i=1}^{s} C_i\right)$ for all $s = 1, \dots, |\mathcal{C}|$
(note that some of these unifications do not contain all the nodes). In addition, we allow
this locating procedure to skip over tiny communities of size less than 4. This condition
follows from Proposition 3.1. It makes sense in terms of mobile or social
networks, where a group of mobile devices or a social community usually has size larger
than 3, and it intuitively agrees with the findings of [34][66]. Thus, the condition $|C| \geq 4$ is
imposed for any community $C$ we discuss hereafter. The tiny communities will then be
identified later. Alg. 6 describes this procedure.


Algorithm 6 Locating local communities
Input: $G = (V, E)$
Output: A collection of raw communities $\mathcal{C}_r$.
 1: $\mathcal{C}_r \leftarrow \emptyset$;
 2: for ($(u, v) \in E$) do
 3:   if ($Com(u) \cap Com(v) = \emptyset$) then
 4:     $C \leftarrow \{u, v\} \cup (N(u) \cap N(v))$;
 5:     if ($|C^{in}| \geq \sigma(C)$ and $|C| \geq 4$) then
 6:       Check $C$'s connectivity if $|C| = 5$;
 7:       Define $C$ a local community;
 8:       /* Include $C$ into the raw community structure */
 9:       $\mathcal{C}_r \leftarrow \mathcal{C}_r \cup \{C\}$;
10:     end if
11:   end if
12: end for
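The loop of Alg. 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the threshold $\sigma$ is the assumed form $\binom{|C|}{2}^{1 - 1/\binom{|C|}{2}}$, edges are visited in sorted order (so nodes are assumed to be comparable), and the special handling of $|C| = 5$ in line 6 is not reproduced. The function name and the graph representation (a dict of neighbour sets) are hypothetical.

```python
from itertools import combinations
from math import comb


def locate_local_communities(adj):
    """Sketch of Alg. 6. `adj` maps each node to the set of its neighbours."""

    def sigma(size):
        # Assumed threshold on the number of internal edges.
        pairs = comb(size, 2)
        return pairs ** (1 - 1 / pairs)

    def internal_edges(nodes):
        return sum(1 for a, b in combinations(nodes, 2) if b in adj[a])

    membership = {v: set() for v in adj}  # Com(v): indices of the communities of v
    raw = []                              # the raw community structure
    edges = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
    for u, v in edges:
        if membership[u] & membership[v]:
            continue  # endpoints already share a community: skip this edge
        c = {u, v} | (adj[u] & adj[v])
        if len(c) >= 4 and internal_edges(c) >= sigma(len(c)):
            for w in c:
                membership[w].add(len(raw))
            raw.append(c)
    return raw
```

On a graph made of two 4-cliques joined by a single bridge edge, this sketch returns the two cliques and ignores the bridge, since the bridge endpoints have no common neighbour.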


Lemma 7. All local communities $C$ detected by Alg. 6 satisfy $\Psi(C) \geq \tau(4) \approx 0.74$.
Furthermore, other communities satisfying these conditions will also be detected by Alg. 6.

Proof. Alg. 6 examines every edge $(u, v) \in E$ (except those whose endpoints are
already in the same community), and by this greedy nature, any local community it
detects has $|C| \geq 4$ and $\Psi(C) \geq \tau(C) \geq \tau(4) \approx 0.74$.

We now show that any community $C$ satisfying $|C| \geq 4$ and $\Psi(C) \geq \tau(C) \geq \tau(4)$
will also be detected by Alg. 6. Suppose otherwise, that is, there exists a community $C$
that satisfies these two conditions and is not detected by Alg. 6. To prove that this is not the
case, we do the following: (1) construct a community $D$ which is not detected by Alg. 6,
with $|D| = n = |C|$ and $\Psi(D)$ maximized, and (2) show that $\Psi(D) < \tau(D)$.

Because $|D| = |C|$, we have $\tau(D) = \tau(C)$. Moreover, since $\Psi(D)$ is maximized,
$\Psi(D) \geq \Psi(C)$, which in turn implies $\Psi(C) \leq \Psi(D) < \tau(D) = \tau(C)$. This
contradicts our original assumption, and thus concludes the proof.

To construct $D$, we proceed as follows: (i) make $D$ a clique of size $n$, and (ii) remove edges
from $D$ one by one until $D$ cannot be detected by Alg. 6. In this way, $\Psi(D)$ is
maximized iff the number of removed edges is minimized.

It is easy to see that the least number of edges we have to remove from $D$ is $n/2$ if $n$
is even and $(n - 1)/2$ if $n$ is odd. Therefore, $m_D = n(n - 1)/2 - n/2$ if $n$ is even, and
$m_D = n(n - 1)/2 - (n - 1)/2$ if $n$ is odd. Now, $\Psi(D) < \tau(D)$ iff $m_D < \sigma(D)$. Let
$f(n)$ be the difference between the left- and right-hand sides; we show that $f(n) < 0$
as $n$ increases. Taking the derivative of $f(n)$ gives $\delta f(4) < 0$ and $f(n) \leq f(4) < 0$ for
all even $n \geq 4$, and $\delta f(7) < 0$ and $f(n) \leq f(7) < 0$ for all odd $n \geq 7$. When $n = 5$,
$f(5) > 0$, but this is the only exception and thus can be handled easily in line 6 of Alg. 6.
Therefore, we have $\Psi(D) < \tau(D)$, and hence the conclusion follows. □
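The sign pattern of $f(n)$ used in the proof can be checked numerically. The sketch below is only an illustration: it assumes $\sigma(n) = \binom{n}{2}^{1 - 1/\binom{n}{2}}$ (a form consistent with $\tau(4) \approx 0.74$), and the upper limit of 200 on $n$ is an arbitrary choice for the check, not a bound from the text.

```python
from math import comb


def sigma(n):
    """Assumed edge-count threshold for a community of n nodes."""
    pairs = comb(n, 2)
    return pairs ** (1 - 1 / pairs)


def max_undetected_edges(n):
    """m_D: a clique on n nodes minus n/2 edges (n even) or (n - 1)/2 edges (n odd)."""
    return comb(n, 2) - n // 2


def f(n):
    """Difference m_D - sigma(n); negative means D fails the density test."""
    return max_undetected_edges(n) - sigma(n)


# Sizes in the checked range for which the undetected graph D would still be dense.
exceptions = [n for n in range(4, 200) if f(n) >= 0]
print(exceptions)
```

Under these assumptions the only size in the checked range with $f(n) \geq 0$ is $n = 5$, which matches the exception singled out in the proof.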

Theorem 3.1. The local community structure $\mathcal{C}_r$ detected by Alg. 6 satisfies
$\Psi(\mathcal{C}_r) \geq \tau(4) \times \Psi(OPT)$, where $OPT$ is the optimal dense community assignment satisfying
$\Psi(S) \geq \tau(4)$ for any $S \in OPT$.

Proof. Let $\mathcal{C}_r$ be the local community structure returned by Alg. 6, and $OPT$ be the
optimal solution of the dense community assignment satisfying $\Psi(S) \geq \tau(4)$ for any
$S \in OPT$. Let $k = |OPT|$. Clearly, $\Psi(OPT) \leq k$. By Lemma 7, we know that Alg. 6 can
detect as many communities as $OPT$, but probably with less internal density. Moreover,
since Alg. 6 only skips over edges inside a community, it ensures that no real community is
a substructure of a bigger one. Hence, we have $\Psi(\mathcal{C}_r) \geq \tau(4) \times k \geq 0.74 \times \Psi(OPT)$.
This also implies that Alg. 6 is a 0.74-approximation algorithm for finding local densely
connected communities. □

Lemma 8. The time complexity of Alg. 6 is $O(dM)$, where $d = \max_{v \in V} d_v$.

Proof. The time to examine an edge $(u, v)$ is $|N(u)| + |N(v)| = d_u + d_v$. However, when $u$ and
$v$ are in the same community, $(u, v)$ will be skipped. Therefore, the total time complexity
is upper bounded by $d \sum_{u \in V} d_u = O(dM)$. □

3.2.2 Combining overlapping communities

After Alg. 6 finishes, the raw network community structure is pictured as a collection
of (possibly overlapped) dense parts of the network together with outliers. As some of
those dense parts can possibly share significant substructures, we need to merge them
if they are highly overlapped. To this end, we introduce the overlapping score of two
communities, defined as follows:

$$OS(C_i, C_j) = \frac{|I_{ij}|}{\min\{|C_i|, |C_j|\}} + \frac{|I_{ij}^{in}|}{\min\{|C_i^{in}|, |C_j^{in}|\}}$$

where $I_{ij} = C_i \cap C_j$. Basically, $OS(C_i, C_j)$ measures how important the common nodes and
links shared between $C_i$ and $C_j$ are to the smaller community. In comparison with the
distance metric suggested in [63], our overlapping score not only takes into account the
fraction of common nodes but also values the fraction of common connections, which
is crucial in order to combine network communities. Furthermore, $OS(\cdot, \cdot)$ is symmetric
and scales well with the size of any community, and the higher the overlapping score,
the more the communities in consideration should be merged. In this merging
process, we combine communities $C_i$ and $C_j$ if $OS(C_i, C_j) > \beta$ (Figure 3-2B).


Algorithm 7 Combining local communities
Input: Raw community structure $\mathcal{C}_r$
Output: A refined community structure $\mathcal{D}$.
 1: $\mathcal{D} \leftarrow \mathcal{C}_r$;
 2: Done $\leftarrow$ false;
 3: while (!Done) do
 4:   Done $\leftarrow$ true;
 5:   Order $(C_i, C_j)$'s by their $OS(C_i, C_j)$ scores;
 6:   for ($C_i, C_j \in \mathcal{D}$) do
 7:     if ($OS(C_i, C_j) > \beta$ and …) then
 8:       $C \leftarrow$ Combine $C_i$ and $C_j$;
 9:       /* Update the current structure */
10:       $\mathcal{D} \leftarrow (\mathcal{D} \setminus \{C_i, C_j\}) \cup \{C\}$;
11:       Done $\leftarrow$ false;
12:     end if
13:   end for
14: end while

The time complexity of Alg. 7 is $O(N_0^2)$, where $N_0$ is the number of local communities.
Clearly, $N_0 \leq M$, and thus it can be $O(M^2)$. However, when the intersection of
two communities is upper bounded, by Lemma 9 we know that the number of local
communities is also upper bounded by $O(N)$, and thus the time complexity of Alg. 7
is $O(N^2)$. In our experiments, we observe that the running time of this procedure is
indeed much less than $O(N^2)$.
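The overlapping score and the merging loop can be illustrated with a short Python sketch. It is a simplification of Alg. 7, not a faithful transcription: in each pass it merges only the highest-scoring pair, and it assumes every community has at least one internal edge (otherwise the edge term divides by zero). The function names and the dict-of-sets graph representation are hypothetical.

```python
from itertools import combinations


def internal_edges(nodes, adj):
    """Number of edges of the subgraph induced by `nodes`."""
    return sum(1 for a, b in combinations(nodes, 2) if b in adj[a])


def overlapping_score(ci, cj, adj):
    """OS: shared-node fraction plus shared-edge fraction, each relative to the smaller side."""
    shared = ci & cj
    node_part = len(shared) / min(len(ci), len(cj))
    edge_part = internal_edges(shared, adj) / min(
        internal_edges(ci, adj), internal_edges(cj, adj)
    )
    return node_part + edge_part


def merge_communities(raw, adj, beta):
    """Repeatedly merge the best-scoring pair while its score exceeds beta."""
    communities = [set(c) for c in raw]
    done = False
    while not done:
        done = True
        pairs = list(combinations(range(len(communities)), 2))
        if not pairs:
            break
        i, j = max(
            pairs,
            key=lambda p: overlapping_score(communities[p[0]], communities[p[1]], adj),
        )
        if overlapping_score(communities[i], communities[j], adj) > beta:
            merged = communities[i] | communities[j]
            communities = [c for k, c in enumerate(communities) if k not in (i, j)]
            communities.append(merged)
            done = False
    return communities
```

For instance, a 5-clique and a 4-clique that share a triangle score 3/4 + 3/6 = 1.25, so with $\beta = 0.8$ the sketch merges them, while a disjoint community scores 0 against both and is left alone.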

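Lemma 9. The number of raw communities detected in Alg. 6 is $O(N)$ when the number
of nodes in the intersection of any two communities is upper bounded by a constant $\alpha$.

Proof. For each $C_i \in \mathcal{C}$, decompose it into overlapped and non-overlapped parts,
denoted by $C_i^{ov}$ and $C_i^{nov}$. We have $C_i = C_i^{ov} \cup C_i^{nov}$ and $C_i^{ov} \cap C_i^{nov} = \emptyset$. Therefore,

$$|C_i| = |C_i^{ov}| + |C_i^{nov}|.$$

Now,

$$\sum_{C_i \in \mathcal{C}} |C_i| = \sum_{C_i \in \mathcal{C}} \left(|C_i^{ov}| + |C_i^{nov}|\right) \leq N + \sum_{i < j} |C_i^{ov} \cap C_j^{ov}|,$$

where $N = \sum_{C_i \in \mathcal{C}} |C_i^{nov}| + \left|\bigcup_{C_i \in \mathcal{C}} C_i^{ov}\right|$. For an upper bound of the second term, rewrite

$$\sum_{i < j} |C_i^{ov} \cap C_j^{ov}| \leq N + \sum_{i < j,\ |C_i \cap C_j| \geq 2} |C_i \cap C_j| \leq N(1 + \alpha),$$

where $\alpha = \max\{|C_i \cap C_j| : |C_i \cap C_j| \geq 2\}$.

Hence, $\sum_{C_i \in \mathcal{C}} |C_i| \leq N(2 + \alpha)$. Let $N_0$ be the number of raw communities; it
follows that $N_0 \min\{|C_i|\} \leq \sum_{C_i \in \mathcal{C}} |C_i| \leq (2 + \alpha)N$. Since $\min\{|C_i|\} \geq 4$, we have
$N_0 \leq \frac{(2 + \alpha)N}{4} = O(N)$. □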

Remark

After the above community merging process, detected communities can possibly be
of very large sizes. The explanation is as follows: small quasi-cliques are discovered in
the first phase (Alg. 6) as densely connected parts of the network, and are regarded as
candidate elements for bigger communities in the merging process. If these small
cliques are loosely connected to the rest of the network, they will remain local
communities afterwards. Otherwise, they can be merged with other dense parts to
become new, bigger communities. As a result, if the communities are highly overlapped,
some of them can potentially grow to very large sizes at the end of the merging process,
besides the small cliques detected in the first place. Larger dense quasi-cliques, though
rare in many networks, will surely be detected by FOCS, as shown by Theorem 3.1.

3.2.3 Revisiting unassigned nodes

Even after the above two procedures are executed, there may still exist leftover
nodes or edges that are only weakly attracted to the rest of the network. Because of its
size constraint, the first procedure skips over tiny communities of size less than four
and thus may leave some nodes unlabeled. These nodes will not be touched in the
second phase since they do not belong to any local community, and consequently will
remain unassigned afterwards. Moreover, they are mostly nodes with few connections
to the rest of the network, and thus are very likely supplementary nodes of their
adjacent communities. Therefore, we need to revisit those nodes to either group them
into appropriate communities or classify them as outliers based on their connectivity
structures.


Algorithm 8 Revisit Unassigned Nodes
Input: The refined community structure $\mathcal{D} = \{D_1, D_2, \dots, D_t\}$
Output: The basic community structure $\mathcal{C} = \{C_1, C_2, \dots, C_k\}$
 1: $\mathcal{C} \leftarrow \mathcal{D}$;
 2: for ($u \in V$ and $Com(u) == \emptyset$) do
 3:   $NC(u) \leftarrow \{C_i \in \mathcal{C} \mid u \text{ is adjacent to } C_i\}$;
 4:   for ($C_j \in NC(u)$) do
 5:     if ($F_{C_j \cup \{u\}} \geq F_{C_j}$) then
 6:       $C_j \leftarrow C_j \cup \{u\}$;
 7:       $Com(u) \leftarrow Com(u) \cup \{j\}$;
 8:     end if
 9:   end for
10:   if ($Com(u) == \emptyset$) then
11:     Classify $u$ as an outlier;
12:   end if
13: end for

Alternatively, this process can be thought of as a community trying to hire adjacent
unassigned nodes which are similar to the host community. However, the internal
density function might be too strict for them to be included in any community (which
was also the reason why they were left unassigned). To this end, we need a community
fitness function in order to quantify the similarity between a node $u$ and a neighbor
community $C$. We find that the fitness function $F_S$ (where $S \subseteq V$) commonly
used in [56][39][63] performs competitively on both synthesized and real-world datasets.
Taking into account this fitness function, a community $C$ will keep hiring any unassigned
adjacent vertex of maximum similarity in a greedy manner, provided the newly joined
vertex does not decrease the community's current fitness value. If there is no such
node, $C$ is defined as a final network community. Nodes that remain unlabeled after
this last procedure are identified as outliers. This algorithm is presented in Alg. 8.
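The revisiting step can be sketched as follows. The exact expression of $F_S$ is not recoverable from this section, so the sketch assumes the widely used form $F_S = |S^{in}| / (2|S^{in}| + |S^{out}|)$, where $|S^{in}|$ and $|S^{out}|$ count the edges inside $S$ and leaving $S$; this choice, the function names and the processing order (sorted node labels) are assumptions made only for illustration.

```python
def fitness(nodes, adj):
    """Assumed fitness |S_in| / (2 |S_in| + |S_out|); the formula itself is an assumption."""
    nodes = set(nodes)
    inside = sum(len(adj[v] & nodes) for v in nodes) // 2   # edges inside S
    outside = sum(len(adj[v] - nodes) for v in nodes)       # edges leaving S
    total = 2 * inside + outside
    return inside / total if total else 0.0


def revisit_unassigned(communities, adj):
    """Sketch of Alg. 8: let communities absorb adjacent unassigned nodes."""
    communities = [set(c) for c in communities]
    assigned = set().union(*communities)
    outliers = set()
    for u in sorted(set(adj) - assigned):
        joined = False
        for c in communities:
            # u must be adjacent to c and must not decrease c's fitness.
            if adj[u] & c and fitness(c | {u}, adj) >= fitness(c, adj):
                c.add(u)
                joined = True
        if not joined:
            outliers.add(u)
    return communities, outliers
```

With this assumed fitness, a node that attaches to a community by several links is absorbed, whereas a node whose links mostly lead away from the community lowers the fitness and stays an outlier.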

3.3  Detecting  Evolving  Network  Communities 

We describe AFOCS, the second phase and also the main focus of our detection
framework. In particular, we use AFOCS to adaptively update and trace the network
communities, which were previously initialized by FOCS, as the dynamic network
evolves over time. Note that FOCS is executed only once on $G_0$; after that, AFOCS
takes over and handles all changes introduced to the network.

Let us first discuss the various behaviors of the community structure when the
network topology evolves over time. Suppose $G = (V, E)$ and $\mathcal{C} = \{C_1, C_2, \dots, C_n\}$ are the
current network and its corresponding overlapping community structure, respectively.
We use the term intra links to denote edges whose two endpoints belong to the same
community, inter links to denote those with endpoints connecting different disjoint
communities, and the term hybrid links to stand for the others. For each community $C$ of
$G$, the number of connections joining $C$ with the others is less than the number of
connections within $C$ itself by definition.

Intuitively, the addition of intra links or removal of inter links between communities
of $G$ will strengthen them and consequently make the structure of $G$ clearer.
Similarly, removing intra links from or introducing inter links to a community of $G$ will
decrease its internal density and, as a result, loosen its internal structure. However,
when two communities are only weakly separated from each other, adding or removing links
can make them more attractive to each other and therefore leaves open the possibility that they
overlap with each other or are combined to form a new community. The updating
process, as a result, is very complicated and challenging, since even a seemingly insignificant change
in the network topology could lead to an unpredictable transformation of the
network community structure.

In order to reflect these changes to a complex network, its underlying graph model
is frequently updated by either inserting or removing a node or a set of nodes, or an
edge or a set of edges. A closer look at these events reveals that the introduction or
removal of a set of nodes (or edges) can furthermore be decomposed into a collection
of node (or edge) insertions (or removals), in which only one node (or only one edge) is
inserted (or removed) at a time. Therefore, changes to the network at each time step
can be viewed as a collection of simpler events whose details are as follows:

• newNode (V + u): A new node u and its adjacent edge(s) are introduced to the network.

•  removeNode  (V  -  u):  A  node  u  and  its  adjacent  edge(s)  are  removed  from  the 
network. 

•  newEdge  (E  +  e):  A  new  edge  e  connecting  two  existing  nodes  is  introduced. 

•  removeEdge  (E  -  e):  An  edge  e  in  the  network  is  removed. 
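The decomposition of a batch of topology changes into these four simple events can be sketched in Python. The `Event` record, the `decompose` helper and the order in which events are emitted (insertions first, then removals) are hypothetical choices made for illustration; the text only fixes the four event types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str       # "newNode", "removeNode", "newEdge" or "removeEdge"
    payload: tuple  # (u, neighbours) for newNode, (u,) for removeNode, (u, v) for edges


def decompose(existing_nodes, added_nodes, added_edges, removed_nodes, removed_edges):
    """Split one batch of changes into single-node and single-edge events."""
    events = []
    present = set(existing_nodes)
    pending = {frozenset(e) for e in added_edges}
    for u in added_nodes:
        # Edges from u to nodes that already exist travel with the newNode event.
        incident = {e for e in pending if u in e and (e - {u}) <= present}
        neighbours = tuple(sorted(w for e in incident for w in e if w != u))
        events.append(Event("newNode", (u, neighbours)))
        pending -= incident
        present.add(u)
    for e in sorted(tuple(sorted(e)) for e in pending):
        events.append(Event("newEdge", e))
    gone = set(removed_nodes)
    for u in removed_nodes:
        events.append(Event("removeNode", (u,)))
    for e in sorted(tuple(sorted(e)) for e in removed_edges):
        # Edges incident to a removed node disappear with the removeNode event.
        if not gone & set(e):
            events.append(Event("removeEdge", e))
    return events
```

Each resulting event can then be passed to the handler for its type, so the update procedure never has to reason about more than one node or one edge at a time.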

As we mentioned earlier, our adaptive framework initially requires a basic
community structure $\mathcal{C}_0$. To obtain this basic structure, we apply the FOCS algorithm to
the first network snapshot, i.e., we execute FOCS on the network $G_0$ and then let
AFOCS adaptively handle this structure as the network evolves.

Figure 3-3. A possible scenario when a new node is introduced.

3.3.1  Handling  a  new  node 

Let us discuss the first case, in which a new node $u$ and its associated links are
introduced to the network. The possibilities are that (1) $u$ comes with no adjacent edge or (2)
$u$ comes with many of them connecting one or more possibly overlapped communities. If $u$ has no
adjacent edge, we simply add $u$ to the set of outliers and preserve the current community
structure.

The interesting case happens, and it usually does, when $u$ comes with multiple links
connecting one or more existing communities. Since network communities can overlap
each other, we need to determine which ones $u$ should join in order to maximize the
gained internal density. But how can we do so quickly and effectively? Lemma 10
gives a condition for a new node to join an existing community,
i.e., our algorithm will join node $u$ to $C_i$ if the number of connections $u$ has to $C_i$ suffices:
$d_{ui} > \max\left\{\frac{2|C_i^{in}|}{|C_i| - 1},\ \sigma(|C_i| + 1) - |C_i^{in}|\right\}$. However, failing to satisfy this condition does not
necessarily imply that $u$ should not belong to $C_i$, since it can potentially gather some
substructure of $C_i$ to form a new community (Figure 3-3). Thus, we also need to handle
this possibility. Alg. 9 presents the algorithm.

Lemma 10. Suppose $u$ is a newly introduced node with $d_{ui}$ connections to each
adjacent community $C_i$. Then $u$ will join $C_i$ if
$d_{ui} > \max\left\{\frac{2|C_i^{in}|}{|C_i| - 1},\ \sigma(|C_i| + 1) - |C_i^{in}|\right\}$.

Algorithm 9 Handling a new node u
Input: The current community structure $\mathcal{C}_{t-1}$
Output: An updated structure $\mathcal{C}_t$.
 1: $C_1, C_2, \dots, C_k \leftarrow$ Adjacent communities of $u$;
 2: for $i = 1$ to $k$ do
 3:   if ($d_{ui} > \max\{\frac{2|C_i^{in}|}{|C_i| - 1},\ \sigma(|C_i| + 1) - |C_i^{in}|\}$) then
 4:     $C_i \leftarrow C_i \cup \{u\}$;
 5:   else
 6:     $C \leftarrow N(u) \cap C_i$;
 7:     if ($\Psi(C) \geq \tau(C)$ and $|C| \geq 4$) then
 8:       $C_i \leftarrow C_i \cup \{u\}$;
 9:     end if
10:   end if
11: end for
12: /* Checking new communities formed from outliers */
13: for ($v \in N(u)$ and $Com(v) \cap Com(u) = \emptyset$) do
14:   $C \leftarrow N(u) \cap N(v)$;
15:   if ($\Psi(C) \geq \tau(C)$ and $|C| \geq 4$) then
16:     Define $C$ a new community;
17:   end if
18: end for
19: Merging overlapping communities on $C_1, C_2, \dots, C_k$;
20: Update $\mathcal{C}_t$;

Proof. Prior to $u$ joining $C_i$, the internal density is $\Psi(C_i) = \frac{2|C_i^{in}|}{|C_i|(|C_i| - 1)}$. Similarly, after
$u$ joins $C_i$, the density is $\Psi(C_i \cup \{u\}) = \frac{2(|C_i^{in}| + d_{ui})}{|C_i|(|C_i| + 1)}$. Comparing
these two quantities gives $\Psi(C_i \cup \{u\}) > \Psi(C_i) \iff d_{ui} > \frac{2|C_i^{in}|}{|C_i| - 1}$. Moreover, $u$
should also satisfy $\Psi(C_i \cup \{u\}) > \tau(C_i \cup \{u\})$, which in turn implies $d_{ui} > \sigma(|C_i| + 1) - |C_i^{in}|$.
Therefore, $d_{ui} > \max\left\{\frac{2|C_i^{in}|}{|C_i| - 1},\ \sigma(|C_i| + 1) - |C_i^{in}|\right\}$. □
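The joining condition of Lemma 10 is easy to evaluate numerically. The sketch below assumes $\sigma(n) = \binom{n}{2}^{1 - 1/\binom{n}{2}}$, a form that is not stated in this section; the function names are hypothetical.

```python
from math import comb


def sigma(size):
    """Assumed edge-count threshold for a community of `size` nodes."""
    pairs = comb(size, 2)
    return pairs ** (1 - 1 / pairs)


def join_threshold(size, internal_edges):
    """max{ 2|C_in| / (|C| - 1), sigma(|C| + 1) - |C_in| } from Lemma 10."""
    keeps_density = 2 * internal_edges / (size - 1)
    stays_dense = sigma(size + 1) - internal_edges
    return max(keeps_density, stays_dense)


def should_join(d_ui, size, internal_edges):
    """True if a new node with d_ui links into the community satisfies Lemma 10."""
    return d_ui > join_threshold(size, internal_edges)


# Hypothetical community with 5 nodes and 8 internal edges.
print(should_join(5, 5, 8), should_join(4, 5, 8))
```

For this hypothetical community the first term equals 4 and the second is roughly 4.52, so a new node needs links to all five members to join directly. With only four links the density would stay at 0.8, and the node falls through to the second branch of Alg. 9.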

The analysis of Alg. 9 is given by Lemma 11. In particular, we show that this
procedure achieves at least 74% of the internal density of the optimal assignment for $u$,
given the prior community structure.

Lemma 11. Alg. 9 produces a community assignment that, prior to the community
combination process, achieves $\Psi(\mathcal{C}_t) \geq \tau(4) \times \Psi(OPT(u)_t)$, where $OPT(u)_t$ is the
optimal community assignment for $u$ at time $t$, given the prior community structure $\mathcal{C}_{t-1}$.

Proof. Let $C_1, C_2, \dots, C_k$ be the communities (including the newly formed ones) in $\mathcal{C}_t$ that
Alg. 9 assigns the new node $u$ to. Note that in the optimal solution $OPT(u)_t$, the number
of communities $u$ belongs to should not exceed $k$, since each $C_i$ is also a candidate for
$OPT(u)_t$ (of course, $OPT(u)_t$ could possibly rearrange nodes differently). Therefore,
the optimal internal density gained is upper bounded by $k$. On the other hand, Alg. 9
makes sure that each community $C_i$ that $u$ joins has $\Psi(C_i) \geq \tau(C_i) \geq \tau(4)$,
since $|C_i| \geq 4$. Thus, Alg. 9 will achieve at least $\tau(4) \times k \geq 0.74 \times \Psi(OPT(u)_t)$. □

Figure 3-4. Possible scenarios when a new edge is introduced.

3.3.2  Handling  a  new  edge 

In the case where a new edge $e = (u, v)$ connecting two existing vertices $u$ and $v$
is introduced, we divide it further into four smaller cases: (1) $e$ is solely inside a
single community $C$, (2) $e$ is within the intersection of two (or more) communities, (3) $e$
joins two separated communities, and (4) $e$ crosses overlapped communities. If $e$
is totally inside a community $C$, its presence will strengthen $C$'s internal density, and by
Lemma 12, we know that adding $e$ should not split the current community $C$ into smaller
substructures.

In this first subcase, the introduction of the new edge might increase the density
of some part of $C$, and it is reasonable to think of that part (say $D$) as a new separate
community. However, since $D$ originally shared a significant substructure with $C$, the
merging process will then combine $C$ and $D$ (if they were separated) into a bigger
community, thus yielding the same community as if $C$ were kept intact. The
same reasoning applies in the second subcase, when $e$ is within the intersection of two
communities, since their inner densities are both increased. Thus, in these first two
cases, we leave the current network structure intact.


Algorithm 10 Handling a new edge (u, v)
Input: The current community structure $\mathcal{C}_{t-1}$.
Output: An updated community structure $\mathcal{C}_t$.
 1: if ($(u, v) \in$ a single community OR $(u, v) \in C_u \cap C_v$) then
 2:   $\mathcal{C}_t \leftarrow \mathcal{C}_{t-1}$;
 3: else if ($Com(u) \cap Com(v) == \emptyset$) then
 4:   $C \leftarrow N(u) \cap N(v)$;
 5:   if ($\Psi(C) \geq \tau(C)$) then
 6:     Define $C$ a new community;
 7:     Check for combining on $Com(u)$, $Com(v)$ and $C$;
 8:   else
 9:     for ($D \in Com(u)$ (or $D^* \in Com(v)$)) do
10:       if ($\Psi(D \cup \{v\}) \geq \tau(D \cup \{v\})$ (or $\Psi(D^* \cup \{u\}) \geq \tau(D^* \cup \{u\})$)) then
11:         $D \leftarrow D \cup \{v\}$ (or $D^* \leftarrow D^* \cup \{u\}$);
12:       end if
13:     end for
14:     Merging overlapping communities for $D$'s (or $D^*$'s);
15:   end if
16:   Update $\mathcal{C}_t$;
17: end if


Handling the last two subcases is complicated, since each of them can either have no
effect on the current network structure or unpredictably form a new network community,
which can furthermore overlap or merge with the others (Figure 3-4). In particular, there is
a possibility that the introduction of this new link, together with some substructure
of $C_u$ or $C_v$, suffices to form a new community that overlaps not only with $C_u$ and $C_v$
but also with some of the others. The other subcases can be handled similarly. Alg. 10
describes this procedure.
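The four subcases for a new edge can be told apart from the community memberships of its endpoints. The following Python sketch shows one possible reading of the case analysis; the function name, the case labels and the data layout (a dict of community id to node set, and a dict of node to community ids) are hypothetical.

```python
def classify_new_edge(u, v, membership, communities):
    """membership[x]: set of community ids of x; communities[i]: set of nodes."""
    shared = membership[u] & membership[v]
    if len(shared) == 1:
        return "inside_single_community"   # subcase (1): structure kept intact
    if len(shared) >= 2:
        return "inside_intersection"       # subcase (2): structure kept intact
    # No shared community: do the endpoints' communities overlap elsewhere?
    overlapped = any(
        communities[i] & communities[j]
        for i in membership[u]
        for j in membership[v]
    )
    return "crossing_overlapped" if overlapped else "joining_separated"


# Hypothetical structure: communities 0 and 1 overlap at nodes 4 and 5.
communities = {0: {1, 2, 3, 4, 5}, 1: {4, 5, 6, 7}, 2: {8, 9, 10, 11}}
membership = {x: {i for i, c in communities.items() if x in c} for x in range(1, 12)}
print(classify_new_edge(1, 6, membership, communities))
```

Only the last two labels require further work in Alg. 10 (testing for a new community around the edge, or extending existing ones); the first two leave the structure unchanged.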

Lemma 12. If a new edge $(u, v)$ is introduced solely inside a community $C$, it should
not split $C$ into smaller substructures.

Proof. Suppose otherwise, that is, $C$ is divided into smaller parts $C_1$ and $C_2$. Prior to the
introduction of $(u, v)$, we have $\Psi(C) = \Psi(C_1 \cup C_2) \geq \tau(C) = \tau(C_1 \cup C_2)$. Now, when $C_1$
and $C_2$ are formed, they imply that $\Psi(C_1 \cup C_2 + (u, v)) < \tau(C_1 \cup C_2 + (u, v))$. Putting it all
together, we have $\tau(C_1 \cup C_2 + (u, v)) = \tau(C_1 \cup C_2) > \Psi(C_1 \cup C_2 + (u, v)) > \Psi(C) \geq \tau(C_1 \cup C_2)$,
which is a contradiction. Thus, the conclusion follows. □

Figure 3-5. Possible scenarios when an existing node is removed.

3.3.3  Removing  an  existing  node 

When an existing node $u$ is about to be removed from the network, all of its adjacent
edges will also be removed as a consequence. If $u$ is an outlier, we can simply exclude
$u$ and its corresponding links from the current structure and safely keep the network
communities unchanged.

In unfortunate situations where $u$ is not an outlier, the problem becomes very
challenging in the sense that the resulting community is complicated: it can either be
unchanged, or broken into smaller communities, or possibly be further merged
with the other communities. To give a sense of this effect, let us consider two examples
illustrated in Figure 3-5. In the first example, when $C$ is almost a full clique, the removal
of any node will not break it apart. However, if we remove a node that tends to connect
the others within a community, the leftover module is broken into a smaller one together
with a node that will later be merged into one of its nearby communities. Therefore,
identifying the leftover structure of $C$ is a crucial task once a vertex $u$ in $C$ is removed.

To quickly handle this task, we first examine the internal density of $C$ excluding the
removed node $u$. If the number of internal connections still suffices, i.e., $\Psi(C \setminus \{u\}) \geq \tau(C \setminus \{u\})$,
we can safely keep the current network community structure intact because $C$
is still tightly connected, with a sufficient internal density. Otherwise, this community
is weak and shall be broken into smaller ones. These substructures
might further be merged with other communities if $C$ originally overlaps with them. To
efficiently detect these new substructures, we apply Alg. 6 on the subgraph induced by
$C \setminus \{u\}$ to quickly identify the leftover modules in $C$, and then let these modules hire a set
of unassigned nodes that help them increase their inner densities. Finally, we
locally check for community combination, if any, by using an algorithm similar to Alg. 7.
Alg. 11 presents the procedure.


Algorithm 11 Removing a node u
Input: The current community structure $\mathcal{C}_{t-1}$.
Output: An updated structure $\mathcal{C}_t$.
 1: for ($C \in Com(u)$ and $\Psi(C \setminus \{u\}) < \tau(C \setminus \{u\})$) do
 2:   $LC \leftarrow$ Local communities by Alg. 6 on $C \setminus \{u\}$;
 3:   for ($C_i \in LC$ and $|C_i| \geq 4$) do
 4:     $S_i \leftarrow$ Nodes such that $\Psi(C_i \cup S_i) \geq \tau(C_i \cup S_i)$;
 5:     $C_i \leftarrow C_i \cup S_i$;
 6:   end for
 7:   Merging overlapping communities on $LC$;
 8: end for
 9: Update $\mathcal{C}_t$;
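The first test of Alg. 11, deciding which communities of the removed node must be re-examined, can be sketched as below. As elsewhere, $\sigma(n) = \binom{n}{2}^{1 - 1/\binom{n}{2}}$ is an assumed form of the threshold, and the function names are hypothetical. The sketch only selects the weakened leftovers; re-running Alg. 6 on them and merging the results is not shown.

```python
from itertools import combinations
from math import comb


def sigma(size):
    """Assumed edge-count threshold for a community of `size` nodes."""
    pairs = comb(size, 2)
    return pairs ** (1 - 1 / pairs)


def internal_edges(nodes, adj):
    """Number of edges of the subgraph induced by `nodes`."""
    return sum(1 for a, b in combinations(nodes, 2) if b in adj[a])


def communities_to_rebuild(u, communities, adj):
    """Return the leftovers (C minus u) whose density drops below the threshold."""
    weakened = []
    for c in communities:
        if u not in c:
            continue
        rest = c - {u}
        # Leftovers with fewer than two nodes have no density and are always rebuilt.
        if len(rest) < 2 or internal_edges(rest, adj) < sigma(len(rest)):
            weakened.append(rest)
    return weakened
```

Removing a node from a 5-clique leaves a 4-clique, which still passes the test, so nothing is rebuilt. Removing a hub that supplies four of the eight edges of a 5-node community leaves a 4-cycle with only four edges, which falls below the assumed threshold of about 4.45 and is handed back to Alg. 6.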


3.3.4  Removing  an  edge 

In the last situation, when an edge $e = (u, v)$ is about to be removed, we divide it
further into four subcases similar to those of a new edge: (1) $e$ is between two disjoint
communities, (2) $e$ is inside a sole community, (3) $e$ is within the intersection of two (or
more) communities, and finally (4) $e$ crosses overlapping communities.

In the first subcase, when $e$ crosses two disjoint communities, its removal will
make the network structure clearer (since we now have fewer connections between
groups), and thus the current communities should be kept unchanged. When $e$ is
totally within a sole community $C$, handling its removal is complicated since this can lead
to an unpredictable transformation of the host module: $C$ could either be unchanged or
broken into smaller modules if it contains substructures which are less attractive to each
other, as depicted in Figure 3-6. Therefore, the problem of identifying the structure of the
remaining module becomes the central part not only of this case but also of the others.

Figure 3-6. Possible scenarios when an existing edge is removed.


Algorithm 12 Removing an edge (u, v)
Input: The current structure $\mathcal{C}_{t-1}$.
Output: An updated community structure $\mathcal{C}_t$.
 1: if ($(u, v)$ is an isolated edge) then
 2:   $\mathcal{C}_t = (\mathcal{C}_{t-1} \setminus \{u, v\}) \cup \{u\} \cup \{v\}$;
 3: else if ($d_u = 1$ (or $d_v = 1$)) then
 4:   $\mathcal{C}_t = (\mathcal{C}_{t-1} \setminus C(u)) \cup \{u\} \cup C(v)$;
 5: else if ($C = C(u) \cap C(v) = \emptyset$) then
 6:   $\mathcal{C}_t = \mathcal{C}_{t-1}$;
 7: else if ($\Psi(C \setminus (u, v)) < \tau(C \setminus (u, v))$) then /* Here $C \neq \emptyset$ */
 8:   $LC \leftarrow$ Local communities by Alg. 6 on $C \setminus (u, v)$;
 9:   Define each $L \in LC$ a local community of $\mathcal{C}_{t-1}$;
10:   Merging overlapping communities on $L$'s;
11: end if
12: Update $\mathcal{C}_t$;


To quickly handle these tasks, we first verify the inner density of the remaining
module and again utilize the local community location method (Alg. 6) to locally
identify the leftover substructures. Next, we check for community combination, since
these structures can possibly overlap with existing network communities. The detailed
procedure is described in Alg. 12.
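The branching of Alg. 12 depends only on the degrees and community memberships of the two endpoints. The Python sketch below illustrates that dispatch; the function name, the case labels and the `membership` mapping (node to set of community ids) are hypothetical, and the density recheck of the last branch is left to the caller.

```python
def edge_removal_case(u, v, adj, membership):
    """Classify the removal of (u, v); `adj` is the graph before the removal."""
    if len(adj[u]) == 1 and len(adj[v]) == 1:
        return "isolated_edge"         # lines 1-2: u and v become singletons
    if len(adj[u]) == 1 or len(adj[v]) == 1:
        return "pendant_endpoint"      # lines 3-4: the degree-one endpoint splits off
    if not membership[u] & membership[v]:
        return "no_shared_community"   # lines 5-6: the structure is kept
    return "recheck_density"           # lines 7-10: compare the density with tau
```

Only the last case triggers the local re-detection with Alg. 6, and only if the remaining module fails the density test.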

3.3.5  Remarks 

Note that the ultimate goal of our framework is to adaptively detect and update the
community structure as the network evolves, i.e., to mainly deal with the dynamics of a
mobile network. As a result, we mainly put our focus on AFOCS. Although FOCS, the
first detection phase, appears to be a centralized algorithm, it is executed only once
at the very first network snapshot, whereas AFOCS stays up and locally handles all
changes as the network evolves over time. That is, we do not execute FOCS again.
Furthermore, AFOCS can be run independently of FOCS, i.e., one can use any
localized detection algorithm to identify a basic community structure in the first phase.
Thus, AFOCS can easily be applied to solve mobile network problems.

3.3.6  Complexity 

Our main algorithm consists of two parts: (1) finding the basic community structure
and (2) updating the network community structure through changes introduced at every
time step. The complexity of quickly unfolding the basic network community structures
has been claimed to be linear in the number of nodes and links, $O(M + N)$ [58]. To
handle the case of a new node of degree p coming in, our algorithm computes the p forces
this new node applies to its neighbors, which results in linear time complexity $O(p)$.
When a new edge connecting nodes u and v is introduced to the network, our algorithm
simply computes the forces applied to the nodes adjacent to the communities, which takes
$O(|C(u)| + |C(v)|)$ in the best case and $O(k \times \max\{|C(u)|, |C(v)|\})$ in the worst case,
when some nodes in a module are pulled out to form new communities (where k is the
number of communities in G). The time taken to handle the last two cases is essentially
the time complexity of the clique percolation, which is roughly $O(|C(u)|^3)$ in the worst
case. Although the time complexity is cubic in the number of nodes, the number of
nodes inside a single community is relatively small in comparison with the total number
of vertices N, and thus does not affect the actual running time. Experimental results in
Section 4 show that our algorithm performs quickly and smoothly in large online social
networks.

Figure 3-7. NMI scores for different values of $\beta$. N = 5000 (top), N = 1000 (bottom),
$\mu$ = 0.1 (left), $\mu$ = 0.3 (right).

3.4  Experimental  Results 

In this section, we first present the empirical results of AFOCS in comparison with
two static detection methods: CFinder, the most popular method [84], and COPRA,
the most effective method [40]. We next compare the performance of AFOCS with other
dynamic methods including OSLOM [60], FacetNet [71] and iLCD [9].

Data Sets: We use networks generated by the well-known LFR overlapping
benchmark [58], the 'de facto' standard for evaluating overlapping community detection
algorithms. Generated networks follow power-law degree distributions and contain
embedded overlapping communities (the ground truth) of varying sizes that capture the
internal characteristics of real-world networks.

Figure 3-8. Comparison among AFOCS, COPRA and CFinder methods. N = 5000 (top),
N = 1000 (bottom), $\mu$ = 0.1 (left), $\mu$ = 0.3 (right).

Set up: To compare fairly with COPRA and to avoid bias, we keep the
parameters close to those of [40]: the minimum and maximum community sizes are
$c_{min}$ = 10 and $c_{max}$ = 50, and each vertex belongs to at most two communities,
$o_m$ = 2. The network sizes are N = 1000 and N = 5000, and the mixing rates are
$\mu$ = 0.1 and $\mu$ = 0.3. The overlapping fraction $\gamma$, which determines the fraction
of overlapped nodes, ranges from 0 to 0.5. Since COPRA is nondeterministic, we run it
10 times on each instance and select the best result.

Metrics: We evaluate the following metrics.

(1) The generalized Normalized Mutual Information (NMI) [56], specially built for
overlapping communities. NMI scores the similarity between the detected network
communities and the ground truth. This is a standardized measure, since NMI(U, V) = 1 if
structures U and V are identical and 0 if they are totally separated.

(2) The number of communities, ignoring singleton communities and unassigned
nodes. A good community detection method should produce roughly the same number
of communities as the known ground truth.

3.4.1 Choosing the overlapping threshold $\beta$

The overlapping threshold $\beta$ is the only input parameter required by our framework,
and thus determining its appropriate value plays an important role in assessing
AFOCS's performance. To best determine this threshold, we run AFOCS on generated
networks with different values of $\beta$ and record the similarities between the detected
communities and the ground truth via NMI scores (Figure 3-7). Higher NMI scores
indicate better $\beta$ values.

As a threshold parameter, $\beta$ controls how much substructure communities can have
in common. Smaller values of $\beta$ mean that we allow network communities to overlap
more with each other, and vice versa. Similarly, $\beta$ can be thought of as the zooming
scale of the network structure, where lower values of $\beta$ reveal the coarser and higher
values reveal the finer structure. As depicted in Figure 3-7, the best values for $\beta$ range
from 0.67 to 0.80, among which $\beta$ = 0.70 yields the best community similarity (NMI
scores range from 0.8 to 1) in all of the generated networks. Therefore, we fix the
overlapping threshold in AFOCS to 0.70 hereafter.

3.4.2  Reference  to  static  methods 

We show our results in groups of four. For each case we vary the overlapping
fraction $\gamma$ from 0 to 0.5 and analyze the results found by AFOCS, CFinder, COPRA
and the (static) OSLOM method (OSLOMs). We only present results for the
parameters that give top performance for CFinder (clique size k = 4, 5) and COPRA
(max. communities per vertex v = 3, 6).

Figure 3-8A shows the number of communities found by AFOCS, COPRA,
CFinder and OSLOMs, together with the ground truth. This figure reveals that the numbers
of communities found by AFOCS, marked with squares, are the closest to the ground
truth and become almost identical to it as the overlapping fraction gets higher. There is an
exception when N = 1000 and $\mu$ = 0.3, which we will discuss later. In terms
of NMI scores, as one can infer from Figure 3-8B, AFOCS achieves the highest
performance among all methods and is much more stable. A common trend in this
test is that the performance of all methods degrades (1) when the mixing rate $\mu$ increases,
i.e., when the community structure becomes more ambiguous, or (2) when the size of the
network decreases while the mixing rate $\mu$ stays the same. Although AFOCS is less
competitive when both negative factors occur, in the bottom-right chart
(N = 1000 and $\mu$ = 0.3), it is in general the best performer. OSLOMs, the static version
of the OSLOM method, does not appear to perform well on these synthesized data, as its
NMI scores are low and degrade quickly when the network communities become more
stochastic. The NMI scores of AFOCS, on the other hand, remain high and stable even
when the network community structure becomes unclear as the overlapping fraction
increases.

A significant gap is observed when the mixing rate gets higher ($\mu$ = 0.3) and the
network size gets smaller (N = 1000). AFOCS finds fewer communities
than the ground truth, but with much higher overlapping rates. The reason is that
with a larger mixing rate $\mu$, a node has more edges connecting to vertices in other
communities, which increases the chance that AFOCS will merge highly overlapped
communities. Hence, AFOCS creates fewer but larger communities. We note
that this 'weakness' of AFOCS is debatable since, when the mixing rate increases, the
ground truth does not necessarily coincide with the structure implied by the network's
topology. Extensive experiments show the ability of AFOCS to identify high-quality
overlapping communities. In addition, we found that AFOCS runs substantially faster than
its competitors: on the Facebook regional network [100] containing 63K nodes,
AFOCS is 150x faster than COPRA, while CFinder is unable to finish its tasks.

3.4.3  Reference  to  other  dynamic  methods 

We next observe the performance of AFOCS in reference to three dynamic methods:
FacetNet, iLCD and OSLOM. Since the ground-truth communities are known on
synthesized datasets, fair comparisons among these methods can be obtained via
their NMI scores and running times. The higher the NMI scores and the lower the
running time, the better the method.

Each synthesized dynamic network is simulated via 5 snapshots, in which the basic
communities are formed by using 50% of the network data, with approximately 10%
of the network evolution (node/edge additions and removals) added to each growing
snapshot at a time. Since FacetNet requires the number of communities a priori, we
supply this method with the actual number as mined from the ground truth. For the iLCD
and OSLOM methods, we keep the default settings as provided in their deliverables.
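One way to realize such a snapshot schedule is sketched below. It is only an illustration under simplifying assumptions: the network is given as an ordered edge list, only edge additions are modelled (removals are left out), and the function name `split_into_snapshots` is hypothetical.

```python
def split_into_snapshots(edges, base_fraction=0.5, num_snapshots=5):
    """Split an ordered edge list into a base network plus growth batches.

    The first `base_fraction` of the edges forms the basic network; the
    remainder is divided as evenly as possible among `num_snapshots` batches
    (about 10% each for the default values). Removals are not modelled here.
    """
    n_base = int(len(edges) * base_fraction)
    base, rest = edges[:n_base], edges[n_base:]
    batches = []
    start = 0
    for i in range(num_snapshots):
        # Spread the remainder so that batch sizes differ by at most one edge.
        extra = 1 if i < len(rest) % num_snapshots else 0
        size = len(rest) // num_snapshots + extra
        batches.append(rest[start:start + size])
        start += size
    return base, batches
```

Each batch would then be fed to the adaptive update step, while the base network is processed once by the initial detection phase.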

We first evaluate the objective function, i.e., the total internal density obtained by
all methods, in Figure 3-9A. Although internal density is not necessarily the objective
of the other methods, this metric indicates how strong the community
structure detected by each approach is. As revealed in Figure 3-9A, AFOCS obtains
high internal density in all tests and lags behind only the iLCD approach.

[Figure 3-9, panel B: NMI scores]

Figure 3-10. The number of communities obtained by the AFOCS, iLCD, FacetNet,
OSLOM and OSLOMs methods.

The NMI scores of the four methods are presented in Figure 3-9B and Figure 3-10. These
subfigures reveal that the NMI scores of AFOCS are overall higher than those of FacetNet,
iLCD and OSLOM. In particular, the NMI scores of AFOCS lag only about 5-7%
behind those of OSLOM and iLCD in the first 2 network snapshots, and are much better
than the others at the end of the evolution. OSLOM's NMI values are very high at
the very beginning; however, they tend to decrease quickly as more connections and
nodes are introduced. The NMI scores of iLCD and FacetNet tend to fluctuate and also
decrease significantly at the last snapshot. AFOCS, in contrast, keeps its NMI
scores high and stable, especially at the end of the network evolution. This implies that
the communities discovered by AFOCS are more similar to the ground truth than
those of the other dynamic methods, especially in the long run. The numbers of communities
found by all methods are reported in Figure 3-10. The closer the detected
numbers of communities are to the ground truth, the better the method is believed to be.
As revealed in the subfigures of Figure 3-10, the numbers discovered by AFOCS
closely approach the actual numbers, even when the mixing rates are high
(right figures). This close agreement in the number of communities is possibly
the best explanation for the high NMI scores of AFOCS relative to its competitors.

We next take a look at the running time of all methods on these synthesized
networks. AFOCS requires at most 5 seconds to finish updating each network snapshot,
whereas FacetNet needs more than 25 seconds (5x more time consuming) on
networks with just 5000 nodes. iLCD and OSLOM also run fast on these generated
datasets; however, the similarity between their detected communities and the ground truth
is surprisingly poor, as the results above reveal. Therefore, among dynamic
approaches, we strongly believe that AFOCS achieves competitive community detection
results in a timely manner. These results also give us confidence when applying
AFOCS to analyze real-world networks.

CHAPTER 4

COMMUNITY STRUCTURE DETECTION USING NONNEGATIVE MATRIX FACTORIZATION

In this chapter, we analyze two approaches, namely iSNMF and iANMF, for effectively
identifying social network communities using Nonnegative Matrix Factorization (NMF)
with the I-divergence (Kullback-Leibler divergence) as the cost function. Our approaches
work by iteratively factorizing a nonnegative input matrix through derived multiplicative
update rules and the Quasi-Newton method. By doing so, we can not only extract
meaningful overlapping communities via the soft community assignments produced by
NMF but also handle both directed and undirected networks, with or without
weights. We give the complete multiplicative update rules for factorizing $X \approx HH^T$
(the iSNMF problem) and $X \approx HSH^T$ (the iANMF problem) to effectively identify overlapping
communities on social networks. These approaches are topology-independent and their
solutions can be easily interpreted. We provide in detail the foundational properties as
well as the proofs of correctness and convergence for both the iSNMF and iANMF problems.
We also propose the Quasi-Newton method to speed up the iSNMF
update rule. Furthermore, we validate the performance of our approaches through
extensive experiments on not only synthesized datasets but also real-world networks.
Empirical results show that iSNMF is among the most efficient detection methods on
undirected networks, while iANMF outperforms currently available methods on directed
networks, especially in terms of detection quality.

4.1  Problem  Definition  and  Properties 
4.1.1  Motivation  for  NMF  in  community  detection 

Let us first get some insight into how NMF can be helpful in detecting network
communities, especially overlapping ones. Consider the toy network G = (V, E)
pictured in Figure 4-1. This network contains clear communities $C_1$ and $C_2$ having node
4 in common. The adjacency matrix X of this ideal network can be represented as

$$X = \begin{pmatrix} S_1 & 0 \\ 0 & S_2 \end{pmatrix},$$

where $S_1$ and $S_2$ are 4 x 4 and 5 x 5 square matrices corresponding to
$C_1$ and $C_2$, respectively.

Figure 4-1. An illustrative example motivating NMF in community detection

This adjacency matrix X summarizes all the network information
and is the only thing we have. So, how can we recover the appropriate communities
(or the community indicators) from this matrix alone? This is where NMF comes into the
picture. In particular, the special NMF factorization $X \approx HSH^T$ gives us H and
S as the community indicator and the community internal-strength indicator matrices,
respectively. In this example, the $X \approx HSH^T$ factorization realizes $S = I_2$ and

$$H^T = \begin{pmatrix} 1 & 1 & 1 & .87 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & .89 & 0.99 & 0.99 & 0.99 & 0.99 \end{pmatrix}.$$

Matrix H clearly indicates that nodes 1-3 should be in one community and nodes 5-8
should belong to another. H also suggests that node 4 should be an overlapping
node due to its significant contribution to both communities. These assignments
indeed reflect the true node labels. In addition, matrix S indicates that each detected
community attains its perfect internal strength, which intuitively agrees with the original
clique structures. This illustrative example, though simple, motivates the application
of the NMF factorization $X \approx HSH^T$ in community detection. Note that when X is
symmetric (i.e., the network is undirected), S is also symmetric and thus can be further
absorbed into H by the assignment $H \leftarrow HS^{1/2}$. Hence, the problem reduces to
$X \approx HH^T$ only when X is symmetric.
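The intuition of the toy example can be checked numerically with a short NumPy sketch. It assumes the network described above (two cliques on nodes 1-4 and 4-8) and uses an idealized binary indicator matrix rather than the factor values reported in the text; the variable names are illustrative only.

```python
import numpy as np

# Two cliques sharing node 4 (nodes are 1-based in the text, 0-based here).
c1, c2 = [0, 1, 2, 3], [3, 4, 5, 6, 7]
n = 8
X = np.zeros((n, n))
for clique in (c1, c2):
    for i in clique:
        for j in clique:
            if i != j:
                X[i, j] = 1.0

# Idealized binary community indicator: column b marks the members of C_b.
H = np.zeros((n, 2))
H[c1, 0] = 1.0
H[c2, 1] = 1.0

R = H @ H.T  # rank-2 reconstruction of the adjacency pattern
off_diag = ~np.eye(n, dtype=bool)
# The reconstruction is nonzero exactly where the network has an edge.
same_support = np.array_equal(R[off_diag] > 0, X[off_diag] > 0)
# Nodes with weight in more than one column are the overlapping nodes.
overlap_nodes = np.where((H > 0).sum(axis=1) > 1)[0] + 1  # back to 1-based
print(same_support, overlap_nodes)
```

In this idealized setting the off-diagonal support of $HH^T$ coincides with the edge set, and the only row of H with two nonzero entries is the one for node 4.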

4.1.2  Problem  definitions 

In order to quantify the goodness of the approximation, we use the I-divergence
(Kullback-Leibler (KL) divergence) between two nonnegative matrices A and B
suggested by [64] as

$$D(A\|B) = \sum_{ij}\left(A_{ij}\log\frac{A_{ij}}{B_{ij}} - A_{ij} + B_{ij}\right).$$

Due to the inequality $x \log x \ge x - 1$ for all $x > 0$, it is easy to see that $D(A\|B)$ is lower
bounded by zero and vanishes if and only if A = B. However, unlike the Euclidean distance, this
function is not symmetric in A and B, so we refer to it as the "divergence" from A to B.
The smaller the divergence between A and B, the more similar they are. Therefore, our
objectives seek the factorizations $X \approx HH^T$ and $X \approx HSH^T$ such that $D(X\|HH^T)$
and $D(X\|HSH^T)$ are minimized. Formally, the problems we are interested in can be
stated as follows (here the little "i" comes from the I-divergence):

Problem 1 (iSNMF) Given a nonnegative symmetric matrix X, find a matrix $H \ge 0$
that minimizes $D_X(HH^T) = D(X\|HH^T)$.

Problem 2 (iANMF) Given a nonnegative asymmetric matrix X, find matrices
$H, S \ge 0$ that minimize $D_X(HSH^T) = D(X\|HSH^T)$.
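As a small illustration of the cost function used in both problems, the following NumPy sketch evaluates the I-divergence $D(A\|B) = \sum_{ij}(A_{ij}\log(A_{ij}/B_{ij}) - A_{ij} + B_{ij})$. The function name `i_divergence`, the convention $0 \log 0 = 0$ and the guard `eps` are choices made for this sketch, not part of the original formulation.

```python
import numpy as np

def i_divergence(A, B, eps=1e-12):
    """Generalized KL (I-)divergence D(A||B) = sum(A*log(A/B) - A + B).

    Entries with A_ij = 0 contribute only B_ij (using 0*log 0 = 0);
    `eps` guards against division by zero in B and is an implementation choice.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    B_safe = np.maximum(B, eps)
    ratio = np.where(A > 0, A / B_safe, 1.0)  # log(1) = 0 where A is zero
    return float(np.sum(A * np.log(ratio) - A + B))
```

For instance, with A = [[1, 2]] and B = [[2, 2]] the two directions give $1 - \log 2$ and $2\log 2 - 1$, which shows that the measure is not symmetric.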

4.1.3  Properties  of  iSNMF  and  iANMF  factorizations 

By Lemma 13, we give important properties of iSNMF and iANMF: the divergences
$D_X(HH^T)$ and $D_X(HSH^T)$ are convex in S only or H only; however, they are not convex
in both variables together. Although the same observations have been made for
the general NMF problem with both the Frobenius and I-divergence cost functions [64], no
claim has been made particularly for the iSNMF and iANMF problems, especially for the
I-divergence function.

Lemma 13. The divergences $D_X(HH^T)$ and $D_X(HSH^T)$ in iSNMF and iANMF are convex
in H or S only but not in both S and H together.

Proof. (Convexity in S) Suppose H is a fixed matrix. For any numbers $\alpha, \beta \in [0, 1]$ with
$\alpha + \beta = 1$, we have

$$D_X(H(\alpha S_1 + \beta S_2)H^T) \le \alpha D_X(HS_1H^T) + \beta D_X(HS_2H^T),$$

if and only if

$$-\sum_{ij} X_{ij}\log\left(\alpha[HS_1H^T]_{ij} + \beta[HS_2H^T]_{ij}\right) \le -\alpha\sum_{ij} X_{ij}\log[HS_1H^T]_{ij} - \beta\sum_{ij} X_{ij}\log[HS_2H^T]_{ij}$$

for any matrices $S_1, S_2 \ge 0$. The latter inequality holds true due to the convexity of the
$-\log(\cdot)$ function and Jensen's inequality. Thus, $D_X(HSH^T)$ is convex in S when H is fixed.
(Convexity in H) Assume S is a fixed matrix. Rewrite

$$D_X(HSH^T) = \sum_{ij} X_{ij}(\log X_{ij} - 1) - \sum_{ij} X_{ij}\log[HSH^T]_{ij} + \sum_{ij}[HSH^T]_{ij}.$$

Since the first term is a constant and $-\log(\cdot)$ is a convex function, we need to show that
the last term is also convex in H. Let $f(H) = \sum_{ij}[HSH^T]_{ij}$. Now,

$$\alpha f(H_1) + \beta f(H_2) - f(\alpha H_1 + \beta H_2) = \alpha\beta\sum_{ij}\left([H_1SH_1^T]_{ij} - [H_2SH_1^T]_{ij} - [H_1SH_2^T]_{ij} + [H_2SH_2^T]_{ij}\right)$$

$$= \alpha\beta\sum_{ij}[(H_1 - H_2)S(H_1 - H_2)^T]_{ij} \ge 0$$

(since $S \ge 0$ and $\sum_{ij}[AA^T]_{ij} \ge 0$ for any matrix A). This implies the convexity in H of
$D_X(HSH^T)$.

The convexity in H for iSNMF is derived in the same way by taking S = I, the
identity matrix. The nonconvexity in both S and H follows from the general NMF case
[64]. □

The above properties are nontrivial since they tell us it is unrealistic to solve either the
iSNMF or the iANMF problem for global minima, and consequently motivate the use of
other techniques such as the Projected Gradient [14], Quasi-Newton [108] or particularly the
Alternating Least Squares (ALS) [15] methods to quickly find a local minimum. However,
by Lemma 14, we show that for the iSNMF and iANMF problems, employing the traditional
ALS does not provide any speedup, since we cannot independently update the
columns of S or of H at the same time, which prevents the use of this technique for
our problems.

Lemma 14. Employing the ALS method does not provide any speedup to either iSNMF or
iANMF.

Proof. Let us first review the ALS method's working mechanism on the general NMF
problem $X \approx WH$. Given $X \ge 0$, the ALS method performs the following steps [5]:

1. Randomly initialize $W^1_{ia} \ge 0$, $H^1_{bj} \ge 0$, $\forall i, a, b, j$;

2. For k = 1, 2, ... alternately update $W^{k+1}$ and $H^{k+1}$ by

$W^{k+1} = \arg\min_{W \ge 0} D_X(WH^k)$, and $H^{k+1} = \arg\min_{H \ge 0} D_X(W^{k+1}H)$;

The main idea of the ALS method is to solve each minimization problem as a
collection of several independent nonnegative least squares problems, thanks to the
uncorrelated relationship between W and H. For instance, one can write $H^{k+1} =
\arg\min_{H \ge 0} D(X\|W^{k+1}H)$ as: the j-th column of $H^{k+1}$ $= \arg\min_{h \ge 0} D(x\|W^{k+1}h)$, where
x is the j-th column of X and h is a column vector of appropriate size. Therefore,
each sub-minimization problem requires only the values of a specific column and
consequently can be solved in parallel. Since H and $H^T$ are strongly related, it
is inappropriate to apply the ALS method to the iSNMF problem. For the iANMF problem, we first
note that $[HSH^T]_{ij} = \sum_{k,t} H_{ik}H_{jt}S_{kt}$, which implies that an entry in $HSH^T$ already requires all
values of S even when H is fixed. Therefore, should one try to update a single column
of S independently, as suggested in the S-phase of the ALS method, one has to repeatedly
solve for all elements $S_{kt}$, which may incur even more computational requirements.
Thus, the conclusion follows. □


4.2  The  Update  Rule  for  iSNMF 


4.2.1  Multiplicative  update  rule 

We present our solution for iSNMF when the input matrix X is symmetric. Formally,
given a nonnegative symmetric matrix X of size n x n and an integer K < n, we
need to find a nonnegative matrix H of size n x K such that $D_X(HH^T) = D(X\|HH^T)$ is
minimized.

We solve this problem using the Karush-Kuhn-Tucker (KKT) [13] conditions. In
particular, we introduce the Lagrange multipliers $\alpha_{ij}$ for the constraints $H_{ij} \ge 0$ and
consider the objective function $J = D(X\|HH^T) - \sum_{ij}\alpha_{ij}H_{ij}$, or,

$$J = \sum_{ij}\left(X_{ij}\log\frac{X_{ij}}{[HH^T]_{ij}} - X_{ij} + [HH^T]_{ij}\right) - \sum_{ij}\alpha_{ij}H_{ij}.$$

The KKT conditions require

$$\frac{\partial J}{\partial H_{ab}} = 0$$

as the optimality condition and

$$\alpha_{ab}H_{ab} = 0$$

as a complementary slackness condition for any $H_{ab}$.

For ease of computation, we construct the derivative matrix of $HH^T$ with respect to
$H_{ab}$ in Figure 4-2. For each position (a, b), this derivative matrix is zero everywhere except
for the a-th column and a-th row, whose elements are from the b-th column of H. Using this
matrix, we obtain

$$\frac{\partial D_X(HH^T)}{\partial H_{ab}} = 2\sum_k H_{kb} - 2\sum_k H_{kb}\frac{X_{ak}}{[HH^T]_{ak}}. \qquad (4-1)$$

Figure 4-2. The partial derivative matrix of $HH^T$ with respect to $H_{ab}$

Hence, the optimality condition implies

$$\alpha_{ab} = 2\left(\sum_k H_{kb} - \sum_k H_{kb}\frac{X_{ak}}{[HH^T]_{ak}}\right),$$

and thus, the complementary slackness condition requires

$$2\left(\sum_k H_{kb} - \sum_k H_{kb}\frac{X_{ak}}{[HH^T]_{ak}}\right)H_{ab} = 0,$$

which suggests the following update rule

$$H_{ab} \leftarrow H_{ab}\,\frac{\sum_k H_{kb}X_{ak}/[HH^T]_{ak}}{\sum_t H_{tb}}. \qquad (4-2)$$

In terms of the projected gradient method, the rule above can be obtained by using the
update rule

$$H_{ab} \leftarrow H_{ab} - \nu_{ab}\frac{\partial D_X(HH^T)}{\partial H_{ab}},$$

with the magnitude $\nu_{ab}$ set to some appropriate small positive number. Here, setting

$$\nu_{ab} = \frac{H_{ab}}{2\sum_t H_{tb}}$$

leads to the same update rule as (4-2).

The iSNMF community detection algorithm is described in Alg. 13. Here, $n_0$ is
the maximum number of iterations, $\epsilon$ is the allowed threshold for the quality of the iSNMF
approximation and $\alpha$ is a given scale used to determine community memberships. We
assume that K, the number of communities, is predetermined or given as part of the
input. The choice of $\alpha$ will be described later.


Algorithm 13 iSNMF for community detection

Input: Undirected, unweighted (weighted) adjacency matrix X, K, $n_0$, $\epsilon$, $\alpha$;
Output: Community indicator matrix H;

1: Initialize H to be a random nonnegative matrix;
2: iter ← 0;
3: while (iter < $n_0$) and ($D_X(HH^T) > \epsilon$) do
4:   Update $H_{ab} \leftarrow H_{ab}\,\frac{\sum_k H_{kb}X_{ak}/[HH^T]_{ak}}{\sum_t H_{tb}}$;
5:   iter ← iter + 1;
6: end while
7: % Inferring community labels from H %
8: $C_b \leftarrow \emptyset$ $\forall b = 1 \ldots K$;
9: P ← normalized(H);
10: for each node a and each b ← 1 ... K do
11:   if $P(a, b) \ge \alpha \cdot \max(P(a, :))$ then
12:     $C_b \leftarrow C_b \cup \{a\}$;
13:   end if
14: end for
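The two phases of Alg. 13 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the normalization in step 9 is taken to be row normalization (the membership test is unchanged by any positive row scaling), node ids are 0-based, and the small constants guarding against division by zero as well as the function names are choices made for this sketch.

```python
import numpy as np

def i_divergence(X, R):
    """D(X||R) = sum(X*log(X/R) - X + R), with the convention 0*log(0) = 0."""
    ratio = np.where(X > 0, X / R, 1.0)
    return float(np.sum(X * np.log(ratio) - X + R))

def isnmf_step(X, H, tiny=1e-12):
    """One application of rule (4-2); `tiny` only guards against division by zero."""
    R = np.maximum(H @ H.T, tiny)
    return H * ((X / R) @ H) / np.maximum(H.sum(axis=0), tiny)

def isnmf(X, K, n0=200, eps=1e-6, seed=0):
    """Lines 1-6 of Alg. 13: iterate until n0 steps or D(X||HH^T) <= eps."""
    rng = np.random.default_rng(seed)
    H = rng.random((X.shape[0], K)) + 1e-3   # random, strictly positive start
    for _ in range(n0):
        if i_divergence(X, np.maximum(H @ H.T, 1e-12)) <= eps:
            break
        H = isnmf_step(X, H)
    return H

def infer_communities(H, alpha):
    """Lines 7-14 of Alg. 13: node a joins C_b if P[a, b] >= alpha * max(P[a, :])."""
    H = np.asarray(H, dtype=float)
    row_sums = H.sum(axis=1, keepdims=True)
    # Assumed normalization: rows of P sum to one; all-zero rows stay zero.
    P = np.divide(H, row_sums, out=np.zeros_like(H), where=row_sums > 0)
    member = (P >= alpha * P.max(axis=1, keepdims=True)) & (P > 0)
    return [np.flatnonzero(member[:, b]).tolist() for b in range(H.shape[1])]
```

A typical call would be `infer_communities(isnmf(X, K), alpha)`. Applied to the indicator values of the toy example in Section 4.1.1 with `alpha = 0.8`, the labelling step places node 4 (index 3) in both communities.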


Remark 

Compared with the update rules found in [64], we have shown an important fact:
similar rules can be derived for this special case. However, our multiplicative
update rule (4-2) is not trivial, in the sense that we can obtain a convergence proof for
our proposed rule, whereas one may find it inappropriate to adapt the proofs of [64][16],
which assume absolutely no correlation between W and H.

Analysis 

We provide the convergence analysis for our proposed update rule (4-2) using an
auxiliary function defined as follows:

(Auxiliary function) $G(h, h')$ is an auxiliary function for $F(h)$ if the conditions
$G(h, h') \ge F(h)$ and $G(h, h) = F(h)$ are satisfied.

Lemma 15. [64] If G is an auxiliary function, then F is nonincreasing under the update

$$h^{t+1} = \arg\min_h G(h, h^t).$$

To prove the convergence of the proposed multiplicative update rule, we construct
an auxiliary function $G(H, \tilde{H})$ of $F(H) = D_X(HH^T)$ as follows

$$G(H, \tilde{H}) = \sum_{ij}\left(X_{ij}\log X_{ij} - X_{ij} + [HH^T]_{ij}\right) - \sum_{ijk} X_{ij}\frac{\tilde{H}_{ik}\tilde{H}_{jk}}{\sum_t \tilde{H}_{it}\tilde{H}_{jt}}\left(\log H_{ik}H_{jk} - \log\frac{\tilde{H}_{ik}\tilde{H}_{jk}}{\sum_t \tilde{H}_{it}\tilde{H}_{jt}}\right).$$

Theorem 4.1. The divergence $D_X(HH^T)$ is nonincreasing under the update rule (4-2)
and is invariant when H is at a stationary point of the divergence.

Proof. When $\tilde{H} = H$, it is easy to verify that $G(H, H) = F(H)$; thus we only need to
check $G(H, \tilde{H}) \ge F(H)$. Since $-\sum_{ij} X_{ij}\log[HH^T]_{ij} = -\sum_{ij} X_{ij}\log\left(\sum_k H_{ik}H_{jk}\right)$,
we have $G(H, \tilde{H}) \ge F(H)$ iff

$$-\sum_{ijk} X_{ij}\frac{\tilde{H}_{ik}\tilde{H}_{jk}}{\sum_t \tilde{H}_{it}\tilde{H}_{jt}}\left(\log H_{ik}H_{jk} - \log\frac{\tilde{H}_{ik}\tilde{H}_{jk}}{\sum_t \tilde{H}_{it}\tilde{H}_{jt}}\right) \ge -\sum_{ij} X_{ij}\log\left(\sum_k H_{ik}H_{jk}\right).$$

To prove the above inequality, we apply Jensen's inequality to the convex function
$-\log\left(\sum_k H_{ik}H_{jk}\right)$, yielding

$$-\log\sum_k \alpha_k\frac{H_{ik}H_{jk}}{\alpha_k} \le -\sum_k \alpha_k\log\frac{H_{ik}H_{jk}}{\alpha_k},$$

where $\alpha_k = \alpha_{ijk} = \frac{\tilde{H}_{ik}\tilde{H}_{jk}}{\sum_t \tilde{H}_{it}\tilde{H}_{jt}}$. It is obvious that the $\alpha_k$'s are nonnegative and sum up to unity.
Thus, we have $G(H, \tilde{H}) \ge F(H)$. Taking the derivative of $G(H, \tilde{H})$ with respect to H also
gives the same update rule (4-2). □

4.2.2 Quasi-Newton method for iSNMF


One of the problems with the multiplicative update rule is its slow convergence:
it does converge to a (possible) stationary point, but it may take many
iterations and much time, and it can easily get trapped in a local minimum [5]. One way to
speed up the convergence is to adjust the learning rate in a sequential manner, using
a second-order estimate of the objective function, e.g., the Quasi-Newton method. In
[108], the authors already addressed this method for the general NMF $X \approx WH$, but under
the assumption that W and H are uncorrelated. Obviously, that assumption
does not hold when $X \approx HH^T$, and hence it is not trivial to derive a proper Quasi-Newton
formulation for the iSNMF problem. In fact, we show that the second-order, or Hessian,
matrix of iSNMF is much different from that of the general NMF.

The general Quasi-Newton method, when applied to the iSNMF problem, takes the form

$$H \leftarrow \max\left\{H - [\mathcal{H}^{(D_X)}]^{-1}\nabla_H D_X,\ \epsilon\right\} \qquad (4-3)$$

where $D_X$ is short for $D_X(HH^T)$, $\nabla_H D_X$ is the n x K first-order derivative matrix of
$D_X(HH^T)$ w.r.t. H, $\mathcal{H}^{(D_X)}$ is the nK x nK second-order derivative (or Hessian) matrix of
$D_X$ w.r.t. H and $\epsilon$ is a small nonnegative number to enforce the nonnegativity of H.
Thanks to equation (4-1), the first-order derivative matrix can be found as

$$\nabla_H D_X = 2\left(\mathbf{1} - X ./ (HH^T)\right)H,$$

where $\mathbf{1}$ is an n x n matrix of all 1's and ./ is the Hadamard (element-wise) division. For
any pair (i, j), the entries of the Hessian matrix are the second-order partial derivatives

$$[\mathcal{H}^{(D_X)}]_{ab,ij} = \frac{\partial^2 D_X}{\partial H_{ab}\,\partial H_{ij}}.$$

There are two important differences between the Hessian $\mathcal{H}_{NMF}$ for the general
case [108] and $\mathcal{H}^{(D_X)}$. Firstly, the elements of $\mathcal{H}_{NMF}$ are zeros everywhere except when
a = i, b = j, whereas $\mathcal{H}^{(D_X)}$ obtains values for all combinations of a, b, i and j. Secondly,
due to its sparseness, $\mathcal{H}_{NMF}$ can be written in matrix block form while $\mathcal{H}^{(D_X)}$ might not
be, particularly when it is a full matrix. Therefore, updating H in the iSNMF problem is much
more complicated than usual, since finding $[\mathcal{H}^{(D_X)}]^{-1}$ in (4-3) requires more numerical
computations.

The authors in [108] also proposed a numerical technique to overcome the
ill-conditioned Hessian matrix and to speed up the computing process, which we
find useful when applied to our problems. Here, we briefly state their technique so that
the paper is self-contained (note that (4-5) and (4-6) are not our equations): to reduce
the computational cost, the inversion of the Hessian is replaced with the Q-less QR
factorization computed by LAPACK. The final form of the Quasi-Newton method is

$$H \leftarrow \max\left\{H - \gamma\,R_H \backslash W_H,\ \epsilon\right\} \qquad (4-5)$$

(4-6)

where $I_H$ is the nK x nK identity matrix, and $\gamma = 10^{-12}$ and $\lambda = 0.9$ are the small fixed
regularization and the relaxation parameters, respectively. The \ operator in (4-5) means
Gaussian elimination.
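The first-order ingredient of the Quasi-Newton step can be checked numerically. The sketch below assumes that, for symmetric X, the gradient of $D(X\|HH^T)$ takes the matrix form $2(\mathbf{1} - X ./ (HH^T))H$ obtained by differentiating the divergence entry by entry, and compares it against central finite differences. The function names are illustrative, and the Hessian-based step itself is not implemented here.

```python
import numpy as np

def i_div(X, H):
    """D(X || HH^T) for strictly positive X and H."""
    R = H @ H.T
    return float(np.sum(X * np.log(X / R) - X + R))

def grad_i_div(X, H):
    """Gradient of D(X || HH^T) w.r.t. H for symmetric X: 2 (1 - X ./ HH^T) H."""
    R = H @ H.T
    return 2.0 * (np.ones_like(X) - X / R) @ H

def numeric_grad(X, H, h=1e-6):
    """Central finite differences, used only to sanity-check grad_i_div."""
    G = np.zeros_like(H)
    for a in range(H.shape[0]):
        for b in range(H.shape[1]):
            E = np.zeros_like(H)
            E[a, b] = h
            G[a, b] = (i_div(X, H + E) - i_div(X, H - E)) / (2 * h)
    return G
```

Such a check is a cheap safeguard before plugging a hand-derived gradient into any second-order scheme.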


4.3  Update  Rules  for  iANMF 


4.3.1  Multiplicative  update  rules 

In this section, we present our solution for the iANMF problem when X is not
symmetric. Formally, given a nonnegative asymmetric matrix X of size n x n, we
find nonnegative matrices H and S of size n x K and K x K, respectively, such that
$D_X(HSH^T) = D(X \| HSH^T)$ is minimized. We again solve this problem using the
KKT conditions by introducing the Lagrange multipliers $\alpha_{ij}$ and $\beta_{ij}$ for the constraints
$H_{ij} \ge 0$ and $S_{ij} \ge 0$, respectively, and then consider the objective function
$J = D(X \| HSH^T) - \sum_{ij} \alpha_{ij} H_{ij} - \sum_{ij} \beta_{ij} S_{ij}$. Equivalently, J can be written as

$$J = \sum_{ij} \left( X_{ij} \log \frac{X_{ij}}{[HSH^T]_{ij}} - X_{ij} + [HSH^T]_{ij} \right) - \sum_{ij} \alpha_{ij} H_{ij} - \sum_{ij} \beta_{ij} S_{ij}$$


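The KKT conditions require

$$\frac{\partial J}{\partial H_{ab}} = 0 \quad \text{and} \quad \frac{\partial J}{\partial S_{ab}} = 0,$$

or equivalently,

$$\frac{\partial D_X(HSH^T)}{\partial H_{ab}} = \alpha_{ab} \quad \text{and} \quad \frac{\partial D_X(HSH^T)}{\partial S_{ab}} = \beta_{ab}$$

as the optimality conditions, as well as

$$\alpha_{ab} H_{ab} = 0 \quad \text{and} \quad \beta_{ab} S_{ab} = 0$$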

as a complementary slackness condition for any $H_{ab}$ and $S_{ab}$. For ease of computation,
we construct in Figure 4-3 the matrix of partial derivatives of the entries $[HSH^T]_{ij}$ with
respect to any $H_{ab}$. Here r(A, i) and c(B, j) denote the ith row of A and the jth column of B,
respectively. The elements of this matrix are zero everywhere except in the ath row and the
ath column. Using this partial derivative matrix, we obtain

$$\frac{\partial \sum_{ij} X_{ij} \log [HSH^T]_{ij}}{\partial H_{ab}} = \sum_k X_{ka} \frac{[HS]_{kb}}{[HSH^T]_{ka}} + \sum_k X_{ak} \frac{[SH^T]_{bk}}{[HSH^T]_{ak}}$$

and

$$\frac{\partial \sum_{ij} [HSH^T]_{ij}}{\partial H_{ab}} = \sum_k \left( [HS]_{kb} + [SH^T]_{bk} \right).$$

Figure 4-3. The partial derivative matrix of $HSH^T$ with respect to $H_{ab}$


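Therefore,

$$\begin{aligned}
\frac{\partial D_X(HSH^T)}{\partial H_{ab}} &= -\frac{\partial \sum_{ij} X_{ij} \log [HSH^T]_{ij}}{\partial H_{ab}} + \frac{\partial \sum_{ij} [HSH^T]_{ij}}{\partial H_{ab}} \\
&= -\sum_k X_{ka} \frac{[HS]_{kb}}{[HSH^T]_{ka}} - \sum_k X_{ak} \frac{[SH^T]_{bk}}{[HSH^T]_{ak}} + \sum_k \left( [HS]_{kb} + [SH^T]_{bk} \right). \qquad (4\text{-}7)
\end{aligned}$$

The optimality condition $\partial D_X(HSH^T) / \partial H_{ab} = \alpha_{ab}$ and the complementary slackness condition
$\alpha_{ab} H_{ab} = 0$ together give the following update rule for $H_{ab}$:

$$H_{ab} \leftarrow H_{ab}\, \frac{\sum_k X_{ka} [HS]_{kb} / [HSH^T]_{ka} + \sum_k X_{ak} [SH^T]_{bk} / [HSH^T]_{ak}}{\sum_t \left( [HS]_{tb} + [SH^T]_{bt} \right)} \qquad (4\text{-}8)$$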

Alternatively, this update rule can also be obtained with the projected gradient
method, in particular by updating

$$H_{ab} \leftarrow H_{ab} - \eta_{ab} \frac{\partial D_X(HSH^T)}{\partial H_{ab}}$$

with the magnitude

$$\eta_{ab} = \frac{H_{ab}}{\sum_t \left( [HS]_{tb} + [SH^T]_{bt} \right)}.$$

Now we give the multiplicative update rule for any $S_{ab}$. The partial derivative of
$D_X(HSH^T)$ w.r.t. $S_{ab}$ is derived as

$$\frac{\partial D_X(HSH^T)}{\partial S_{ab}} = \sum_{st} H_{sa} H_{tb} - \sum_{st} X_{st} \frac{H_{sa} H_{tb}}{[HSH^T]_{st}}.$$

The KKT conditions $\partial D_X(HSH^T) / \partial S_{ab} = \beta_{ab}$ and $\beta_{ab} S_{ab} = 0$ together imply the following update
rule for $S_{ab}$:

$$S_{ab} \leftarrow S_{ab}\, \frac{\sum_{st} H_{sa} H_{tb} \left( X_{st} / [HSH^T]_{st} \right)}{\sum_{st} H_{sa} H_{tb}} \qquad (4\text{-}9)$$

Alternatively, this rule can be derived by the projected gradient method

$$S_{ab} \leftarrow S_{ab} - \gamma_{ab} \frac{\partial D_X(HSH^T)}{\partial S_{ab}}$$

with magnitude

$$\gamma_{ab} = \frac{S_{ab}}{\sum_{st} H_{sa} H_{tb}}.$$

The  iANMF  community  detection  is  presented  in  Alg.  14.  The  parameters  and  their 
meanings  in  this  case  are  similar  to  those  described  in  the  SNMF  case. 


Algorithm 14 iANMF for community detection

Input: Directed, unweighted (or weighted) adjacency matrix X, K, n_0, ε, α;
Output: Matrices H and S and the inferred community labels;
1: Initialize H and S to be random nonnegative matrices;
2: iter ← 0;
3: while (iter < n_0) and (D_X(HSH^T) > ε) do
4:   Update H_ab based on equation (4-8);
5:   Update S_ab based on equation (4-9);
6:   iter ← iter + 1;
7: end while
8: % Inferring community labels from H %
9: C_b ← ∅ for b = 1...K;
10: P ← normalized(H);
11: for a ← 1...n, b ← 1...K do
12:   if P(a, b) > α * max(P(a, :)) then
13:     C_b ← C_b ∪ {a};
14:   end if
15: end for

Summary

With the multiplicative update rules (4-8) and (4-9), we give the complete steps
for iteratively solving the iANMF problem with respect to the I-divergence. These rules are
different from those found in prior studies and, to our knowledge, have
not yet been derived in the literature. Thus, they are our contributions in this paper.
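The steps of Algorithm 14 can be sketched in NumPy as follows. This is a minimal illustration under several assumptions, not the implementation used in the experiments: all entries of H (then S) are updated simultaneously in matrix form, a small constant `tiny` guards against division by zero, `normalized(H)` is taken to mean rescaling each row of H to sum to one, and the function names and default values (including `alpha=0.8`) are hypothetical.

```python
import numpy as np

def i_divergence(X, Y):
    """I-divergence D(X || Y) = sum_ij (X_ij log(X_ij / Y_ij) - X_ij + Y_ij)."""
    mask = X > 0                       # 0 * log(0) is taken as 0
    return float(np.sum(X[mask] * np.log(X[mask] / Y[mask])) - X.sum() + Y.sum())

def ianmf(X, K, max_iter=400, tol=1e-6, tiny=1e-12, seed=0):
    """Alternate rules (4-8) and (4-9) on a nonnegative square matrix X."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    H = rng.random((n, K)) + 0.1       # strictly positive random start
    S = rng.random((K, K)) + 0.1
    for _ in range(max_iter):
        # Rule (4-8), applied to every H_ab at once.
        R = X / (H @ S @ H.T + tiny)   # R_ij = X_ij / [HSH^T]_ij
        HS, HSt = H @ S, H @ S.T       # HSt[k, b] = [SH^T]_bk
        numer = R.T @ HS + R @ HSt
        denom = HS.sum(axis=0) + HSt.sum(axis=0)   # depends only on b
        H = H * numer / (denom + tiny)
        # Rule (4-9), applied to every S_ab at once.
        R = X / (H @ S @ H.T + tiny)
        col = H.sum(axis=0)            # col[a] = sum_s H_sa
        S = S * (H.T @ R @ H) / (np.outer(col, col) + tiny)
        if i_divergence(X, H @ S @ H.T + tiny) <= tol:
            break
    return H, S

def infer_communities(H, alpha=0.8):
    """Lines 9-15 of Algorithm 14 with row-normalized H (an assumption)."""
    P = H / (H.sum(axis=1, keepdims=True) + 1e-12)
    members = P > alpha * P.max(axis=1, keepdims=True)
    return [set(np.flatnonzero(members[:, b]).tolist()) for b in range(H.shape[1])]
```

A node is placed in every community whose normalized weight exceeds the fraction `alpha` of that node's largest weight, which is how overlapping memberships arise from a single factor matrix H.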

Analysis 

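We first show the following result.

Theorem 4.2. At a stationary point (H, S) of $D_X(HSH^T)$, the KKT conditions imply that

$$\sum_{st} X_{st} = \sum_{st} [HSH^T]_{st}.$$

Proof. We show that the condition $S_{ab}\, \partial D_X(HSH^T) / \partial S_{ab} = 0$ implies the above equality. In
particular, this KKT condition is equivalent to

$$S_{ab} \sum_{st} H_{sa} H_{tb} = S_{ab} \sum_{st} X_{st} \frac{H_{sa} H_{tb}}{[HSH^T]_{st}}.$$

Summing the LHS over all a and b gives

$$\sum_{ab} S_{ab} \sum_{st} H_{sa} H_{tb} = \sum_{st} \sum_{ab} H_{sa} S_{ab} H_{tb} = \sum_{st} [HSH^T]_{st}.$$

Similarly, summing the RHS over all a and b gives

$$\sum_{ab} S_{ab} \sum_{st} X_{st} \frac{H_{sa} H_{tb}}{[HSH^T]_{st}} = \sum_{st} X_{st} \frac{[HSH^T]_{st}}{[HSH^T]_{st}} = \sum_{st} X_{st}.$$

Therefore, the equality follows. □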
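The summation argument in the proof can be checked numerically. The sketch below is an illustration on random toy data with hypothetical variable names: it applies the multiplicative rule for S once and compares the two totals. The same summation over a and b shows that a single application of the rule already matches the totals, since the updated S satisfies the summed identity with the old ratio on the right-hand side.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 8, 3
X = rng.random((n, n)) + 0.1           # strictly positive toy data
H = rng.random((n, K)) + 0.1
S = rng.random((K, K)) + 0.1

R = X / (H @ S @ H.T)                  # R_st = X_st / [HSH^T]_st
col = H.sum(axis=0)                    # col[a] = sum_s H_sa
S_new = S * (H.T @ R @ H) / np.outer(col, col)   # multiplicative rule for S

total_X = X.sum()                      # sum_st X_st
total_fit = (H @ S_new @ H.T).sum()    # sum_st [H S_new H^T]_st
print(total_X, total_fit)              # the two totals agree up to rounding
```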

We next analyze the convergence of our proposed rules (4-8) and (4-9).
By using appropriate auxiliary functions $G(S, \tilde{S})$ and $G(H, \tilde{H})$, one can show the
following.

Theorem 4.3. The divergence $D(X \| HSH^T)$ is nonincreasing under the update rules
(4-8) and (4-9) and is invariant if and only if S and H are at their stationary points in the
divergence.

Proof. The proof of convergence for the two update rules (4-8) and (4-9) is similar to that of
Theorem 4.1. Let us first define two functions

$$G(S, \tilde{S}) = \sum_{ij} X_{ij} (\log X_{ij} - 1) + \sum_{ij} [HSH^T]_{ij} - \sum_{ijuv} X_{ij}\, \beta_{ijuv} \left( \log H_{iv} S_{vu} H_{ju} - \log \beta_{ijuv} \right),$$

and

$$G(H, \tilde{H}) = \sum_{ij} X_{ij} (\log X_{ij} - 1) + \sum_{ij} [HSH^T]_{ij} - \sum_{ijuv} X_{ij}\, \varrho_{ijuv} \left( \log H_{iv} S_{vu} H_{ju} - \log \varrho_{ijuv} \right),$$

where

$$\beta_{ijuv} = \frac{H_{iv} \tilde{S}_{vu} H_{ju}}{\sum_{st} H_{it} \tilde{S}_{ts} H_{js}} \quad \text{and} \quad \varrho_{ijuv} = \frac{\tilde{H}_{iv} S_{vu} \tilde{H}_{ju}}{\sum_{st} \tilde{H}_{it} S_{ts} \tilde{H}_{js}}.$$

It is clear that each $\beta_{ijuv}$ and $\varrho_{ijuv}$ is nonnegative and that, for each fixed (i, j), they sum to
unity over (u, v). We now prove the convergence of rule (4-9) for S when the matrix H is fixed.
Let $F(S) = D_X(HSH^T)$. We show that $G(S, \tilde{S})$ defined above is an auxiliary function for F(S).
When $\tilde{S} = S$, one can verify that $G(S, S) = F(S)$; thus we need to check that
$G(S, \tilde{S}) \ge F(S)$. This inequality is equivalent to

$$-\sum_{ij} X_{ij} \log [HSH^T]_{ij} \le -\sum_{ijuv} X_{ij}\, \beta_{ijuv} \left( \log H_{iv} S_{vu} H_{ju} - \log \beta_{ijuv} \right).$$

By the definition of $\beta_{ijuv}$, one can rewrite the above inequality, for each pair (i, j), as

$$-\log \sum_{uv} \beta_{ijuv} \frac{H_{iv} S_{vu} H_{ju}}{\beta_{ijuv}} \le -\sum_{uv} \beta_{ijuv} \log \frac{H_{iv} S_{vu} H_{ju}}{\beta_{ijuv}},$$

which holds due to Jensen's inequality and the convexity of the $-\log(\cdot)$ function. Now, setting
the derivative of $G(S, \tilde{S})$ with respect to S to zero gives the update rule (4-9). The proof for H
can be obtained in a very similar manner and thus is omitted here. □
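The last step of the proof can be written out explicitly. The following is a sketch of that step, assuming the auxiliary function has the form $\sum_{ij} [HSH^T]_{ij} - \sum_{ijuv} X_{ij} \beta_{ijuv} \log (H_{iv} S_{vu} H_{ju})$ plus terms independent of S, with $\beta_{ijuv} = H_{iv} \tilde{S}_{vu} H_{ju} / [H\tilde{S}H^T]_{ij}$ and $\tilde{S}$ the current iterate:

```latex
\frac{\partial G(S,\tilde{S})}{\partial S_{vu}}
  = \sum_{ij} H_{iv} H_{ju} - \frac{1}{S_{vu}} \sum_{ij} X_{ij}\,\beta_{ijuv} = 0
\quad\Longrightarrow\quad
S_{vu} = \frac{\sum_{ij} X_{ij}\,\beta_{ijuv}}{\sum_{ij} H_{iv} H_{ju}}
       = \tilde{S}_{vu}\,
         \frac{\sum_{ij} H_{iv} H_{ju}\, X_{ij} / [H\tilde{S}H^T]_{ij}}
              {\sum_{ij} H_{iv} H_{ju}}
```

Renaming (v, u) as (a, b) and (i, j) as (s, t) gives exactly the multiplicative rule (4-9).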

4.4  Experimental  Results 

In this section, we first validate our approaches on different synthesized networks
with known ground truths, and then present our findings on real-world traces including
the Enron email [98] and Facebook social network [87]. To assess our performance, we
compare the results to two NMF methods proposed in [102] (i.e., wSNMF and wANMF),
and the recently suggested Bayesian NMF [90] (i.e., the Bayesian method).

Our methods require the number of communities K as an input parameter. We
stress that determining this quantity is not the main focus of NMF-based detection
methods since almost all of them rely on a predefined K to discover the network
communities, as commonly observed in [102][67][70]. Therefore, this quantity K
is predetermined using a procedure suggested in [80], which has been shown to
predict the number of network communities well and in a timely manner. We also use this
value as input for wSNMF and wANMF. For the Bayesian method, we keep the default
settings provided with its distributed implementation.

4.4.1  Empirical  results  on  synthesized  networks 

Of course, the best way to evaluate our approaches is to validate them on real-world
networks with known community structures. Unfortunately, we often do not know those
structures beforehand, or they cannot be easily mined from the network
topologies. Although synthesized networks might not reflect all the statistical properties
of real ones, they provide known ground truths via planted communities and
the ability to vary other network parameters such as sizes, densities and overlapping
levels. Testing community detection methods on generated data has become a
usual practice that is widely accepted in the field [58]. Therefore, running iSNMF and
iANMF on synthesized networks not only certifies their performance but also gives us
confidence in their behavior when applied to real-world traces.

Set up: We use the well-known LFR overlapping benchmark [57] to generate 22
weighted directed and undirected testbeds. Generated networks follow a power-law
degree distribution and contain embedded overlapping communities of varying sizes
that capture the internal characteristics of real-world networks. Parameters are: the
number of nodes N = 1000, the mixing parameter $\mu$ = 0.1 and 0.3 controlling the overall
sharpness of the community structure, the weight mixing parameter $\mu_w$ = 0.1 and 0.3, the minimum
and maximum community sizes $c_{min}$ = 10 and $c_{max}$ = 50, the maximum number of memberships of
a node $o_m$ = 2, and the overlapping fraction $\gamma \in [0, 0.5]$ measuring the fraction of nodes
with memberships in more than one community. We set the number of iterations to 400 in
all methods and run each of the 22 tests 100 times for consistency.

Figure 4-4. Normalized Mutual Information scores on synthesized networks

Metric: To measure the similarity between detected communities and the
embedded ground truth, we evaluate the Generalized Normalized Mutual Information
(NMI) [56]. NMI(U, V) is 1 if structures U and V are identical and 0 if they are totally
separated. This is the most important metric for a community detection algorithm
because it indicates how closely the detected communities match the true ones.
The higher the NMI value with respect to the ground truth, the better.
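To make the two extreme values concrete, the following sketch computes the standard NMI between two disjoint labelings. It is deliberately simpler than the generalized, overlap-aware variant of [56] used in the experiments, and the function name and the normalization by the sum of the two entropies are choices made for this illustration.

```python
import math
from collections import Counter

def nmi_disjoint(labels_u, labels_v):
    """Standard NMI, 2 * I(U; V) / (H(U) + H(V)), for disjoint labelings."""
    n = len(labels_u)
    count_u, count_v = Counter(labels_u), Counter(labels_v)
    joint = Counter(zip(labels_u, labels_v))
    h_u = -sum(c / n * math.log(c / n) for c in count_u.values())
    h_v = -sum(c / n * math.log(c / n) for c in count_v.values())
    mutual = sum(
        c / n * math.log(c * n / (count_u[a] * count_v[b]))
        for (a, b), c in joint.items()
    )
    if h_u + h_v == 0:                 # both labelings are a single community
        return 1.0
    return 2 * mutual / (h_u + h_v)
```

Two labelings that describe the same grouping under different label names score 1, while two labelings that are statistically independent of each other score 0.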

Detection quality: As depicted in Figure 4-4, our approaches iSNMF and iANMF
achieve the most stable and competitive (if not the best) NMI scores on both
weighted directed and undirected networks. In particular, on undirected networks (top
two panels), NMI scores produced by iSNMF are highly competitive with those of wSNMF
and are up to 84% better than those returned by the Bayesian method. Moreover, its
NMI scores remain high and stable as the overlapping fraction $\gamma$ increases.
This means the communities discovered by iSNMF are consistently similar to
the ground truth even when more and more network communities overlap with
each other. wSNMF also displays these properties on undirected networks; however, its
performance degrades significantly on directed weighted networks, as we will discuss
shortly. The Bayesian method, on the other hand, produces very low NMI values that
tend to decrease quickly as $\gamma$ increases. This implies that the communities detected by this
method do not coincide well with the embedded ones, especially when they highly
overlap with each other.

Figure 4-5. Number of communities on synthesized networks

There is a close relationship between the number of communities and the
identification capacity, which we observed in the case of undirected networks in Figure
4-5. As revealed in its top panels, the input numbers of communities for iSNMF and
wSNMF are almost identical to the ground truth when $\mu$ = 0.1 and deviate slightly from
it when $\mu$ = 0.3, while those of the Bayesian method are far from the baseline.
This close agreement, as a result, helps iSNMF and wSNMF determine a proper
number of basic features and, consequently, indicate more appropriate community labels.
However, this observation does not appear to hold for wANMF on directed networks,
since it performs poorly whereas our approach iANMF still performs excellently on
this type of network (Figure 4-4, bottom panels). The big gap between the Bayesian
method and the ground truth implies that its built-in estimate of the number of communities
could potentially mislead the factorization, thus resulting in its low NMI scores.

The superiority of our iANMF approach becomes more visible on directed weighted
networks (Figure 4-4, bottom panels). In these panels, iANMF returns the best and most
stable NMI values, and they remain high even as $\gamma$ grows, i.e., when strongly
overlapped communities appear. In particular, the NMI scores returned by iANMF
are more than twice those of wANMF and are up to 10% higher than those of the Bayesian method.

The performance of wANMF, surprisingly, drops to no more than half of its prior
level even when fed with a number of communities relatively close to the true one
(bottom panels of Figure 4-5). This in turn indicates that the communities discovered by
wANMF deviate heavily from, and have very low similarity to, the ground truth.
The Bayesian method's performance is roughly the same on these directed networks, with
average NMI scores that tend to decrease quickly in the long run. This comparison among
the three NMF factorizations reveals that iSNMF and iANMF are the most suitable methods
for effectively recovering overlapping network community structures, especially on
weighted and directed networks.

Figure 4-6. Running Time on synthesized networks

We next compare the running time of the three methods. As reported in Figure 4-6,
the running times of iSNMF and wSNMF on undirected networks are fairly similar
(at most 2s difference) and are much less than the time required
by the Bayesian method. On average, the Bayesian method requires almost 200s
to finish the test whereas iSNMF and wSNMF need only roughly 16s and
14s, respectively. On directed networks, iANMF requires nearly the same amount
of time as the Bayesian method and much more time than wANMF. This
time consumption of iANMF is understandable because each update of $S_{ab}$
in equation (4-9) based on the I-divergence already takes $O(n^2)$ time. However, the
superiority of its NMI scores over its competitors makes iANMF a promising
approach, especially for those who strive to discover high-quality network community
structures.

Figure 4-7. The number of communities, internal density and overlapping ratio of the Enron
email and Facebook-like datasets


In summary, comparisons among the three algorithms on generated networks show
that (1) iSNMF is among the best NMF methods for efficiently identifying high-quality
overlapping communities in weighted and unweighted undirected networks, and (2) iANMF is
the best among the three methods for analyzing weighted and directed networks containing
highly overlapped communities, despite its long running time. More importantly, the
performance of both approaches remains stable even when more and more
overlapping communities are introduced. These results give us strong confidence
when applying iSNMF and iANMF to analyze real-world traces.

4.4.2  Results  on  real  networks 

We next use iANMF and iSNMF to analyze real network datasets and present
our findings on their overlapping structures. In particular, we choose the Enron email
dataset and the Facebook-like social network [87]. The Enron email network contains
email message data from about 150 users, mostly senior management of Enron Inc.,
from Jan 1999 to July 2002 [98]. Each email address is represented by a unique
identification number in the dataset and each link corresponds to a message sent
between the sender and the receiver. The Facebook-like social network is collected from
students of the University of California, Irvine. The dataset contains 20296 messages sent
and received among 1899 users. The numbers of communities input for the Enron email
and Facebook-like datasets are set to 8 and 18, respectively.




We are interested in understanding their overlapping structures and what the
overlapping nodes really mean to them, particularly in the top 5 biggest communities. As
revealed in Figure 4-7, the numbers of members in the top 5 communities of the Facebook-like
network are, not surprisingly, much bigger than those of the Enron email network.
However, the internal density, i.e., the strength of the inner structure, of the top 5
communities in the Enron email network is much higher than in the Facebook-like network.
Indeed, the density values of the Enron email communities are more than twice those of the
Facebook-like network. This can be explained by the fact that email communication among
managers in a workplace occurs much more frequently than messaging in a social
environment like the Facebook-like network.

We next investigate the overlapping substructures of these real networks, i.e., we
want to know how much they overlap and what the overlapped nodes mean to
the communities. As described in Figure 4-7, all of the top 5 communities of the Facebook-like
network are highly overlapped whereas just 3 of the top communities of the Enron email network
appear to have this property. Moreover, overlapped nodes in the Facebook-like network tend to be
active users who eagerly participate in multiple communities at the same time, i.e., they
send messages to multiple friends in different groups. Overlapped nodes in the Enron email
network, though fewer, potentially play vital roles in the company since
most of them communicate frequently with many other members in all of the communities.

CHAPTER 5

SOCIAL-AWARE ROUTING STRATEGIES IN MOBILE AD-HOC NETWORKS

In this chapter, we demonstrate the applicability of our proposed detection
algorithms QCA and AFOCS as the community identification cores in forwarding
and routing strategies in mobile dynamic networks. In the following paragraphs, we first
present the application of QCA and then describe how AFOCS can help to improve the
performance of this practical application.

5.1  A  Message  Forwarding  and  Routing  Strategy  Employing  QCA 
In a broad view, a MANET is a dynamic wireless network with or without an
underlying infrastructure, in which each node can move freely in any direction and
organize itself in an arbitrary manner. Due to node mobility and the unstable nature of links in
a MANET, designing an efficient routing scheme has become one of the most important
and challenging problems on MANETs. Recent research has shown that MANETs
exhibit the properties of social networks [46][19][10] and that social-aware algorithms
for network routing are of great potential. This is due to the fact that people have a
natural tendency to form groups or communities in communication networks, where
individuals inside each community communicate with each other more frequently than
with people outside. This social property is reflected in the underlying MANETs
by the existence of groups of nodes where each group is more densely connected inside than
outside. This resembles the idea of community structure in mobile ad hoc networks.

Multiple routing strategies [19]-[45] based on the discovery of network community
structures have provided significant enhancement over traditional methods. However,
the community detection methods utilized in those strategies are not applicable to
dynamic MANETs since they have to recompute the network structure whenever
changes to the network topology are introduced, which results in significant computational
costs and processing time. Therefore, employing an adaptive community structure
detection algorithm as a core will provide speedup as well as robustness to routing
strategies in MANETs.

We evaluate five routing strategies: (1) WAIT: the source node waits until it meets
the destination node; (2) MCP: a node keeps forwarding the messages until they reach
the maximum number of hops; (3) LABEL: a node forwards or sends the messages to all
members in the destination community [46]; (4) QCA: a LABEL version utilizing QCA as
the dynamic community detection method; and lastly, (5) MIEN: a social-aware routing
strategy on MANETs [26].

Even though the WAIT and MCP algorithms are very simple and straightforward to
understand, they provide helpful information about the lower and upper bounds on
the message delivery ratio, time redundancy and message redundancy. The
LABEL forwarding strategy works as follows: it first finds the community structure of the
underlying MANET, assigns the same label to all nodes of each community and then exclusively
forwards messages to destinations, or to next-hop nodes having the same labels as the
destinations. The MIEN forwarding method utilizes the MIEN algorithm as a subroutine. The QCA
routing strategy, instead of using a static community detection method, employs the QCA
algorithm to adaptively update the network community structure and then uses the
newly updated structure to inform the routing strategy for forwarding messages.

5.1.1  Setup 

We choose the Reality Mining data set [29] provided by the MIT Media Lab to test our
proposed algorithm. The Reality Mining data set contains communication, proximity,
location, and activity information from 100 students at MIT over the course of the
2004-2005 academic year. In particular, the data set includes call logs, Bluetooth
devices in proximity, cell tower IDs, application usage, and phone status (such as
charging and idle) of the participating students, covering over 350,000 hours (about 40 years). In this
paper, we take into account the Bluetooth information to form the underlying MANET
and evaluate the performance of the above five routing strategies.

Figure 5-1. Experimental results on the Reality Mining data set

5.1.2 Results


For each routing method, we evaluate the following: (1) Delivery ratio: the portion
of successfully delivered messages over the total number of messages; (2) Average delivery
time: the average time for a message to be delivered; (3) Average number of duplicated
messages for each sent message. In particular, a total of 1000 messages are created
and uniformly distributed over the experiment duration, and each message cannot
exist longer than a threshold time-to-live. The experimental results are shown in Figures
5-1A, 5-1B and 5-1C.
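The three quantities above can be computed from per-message logs as in the following sketch. The record layout (`MessageRecord` and its fields) is a hypothetical one introduced for this illustration, not the format of the Reality Mining traces or of the simulator used in the experiments.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class MessageRecord:
    created_at: float
    delivered_at: Optional[float]   # None if the message expired undelivered
    copies_made: int                # duplicate copies generated while forwarding

def summarize(records: List[MessageRecord]) -> Tuple[float, float, float]:
    """Return (delivery ratio, average delivery time, average duplicates)."""
    delivered = [r for r in records if r.delivered_at is not None]
    ratio = len(delivered) / len(records)
    if delivered:
        avg_time = sum(r.delivered_at - r.created_at for r in delivered) / len(delivered)
    else:
        avg_time = float("nan")     # no message was delivered
    avg_duplicates = sum(r.copies_made for r in records) / len(records)
    return ratio, avg_time, avg_duplicates
```

Note that the average delivery time is taken over delivered messages only, while the duplicate count is averaged over all sent messages.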

Figure 5-1A describes the delivery ratio as a function of time-to-live. As revealed
by this figure, QCA achieves a much better delivery ratio than MIEN as well as LABEL and
far better than WAIT. This means that the QCA routing strategy successfully delivers many
more messages from the source nodes to the destinations than the others. Moreover, as
time-to-live increases, the delivery ratio of QCA tends to approach that of MCP,
the strategy with the highest delivery ratio.

Comparison of delivery time shows that QCA requires less time and gets
messages delivered faster than LABEL, as depicted in Figure 5-1C.
It even requires less delivery time than the social-aware method MIEN.
This can be explained by the fact that the static community structures in LABEL can cause a
message to be forwarded to a wrong community when the destinations change
their communities during the experiment. Both QCA and MIEN, on the other hand,
capture and update the community structures on-the-fly as changes occur, and thus
achieve better results.

The numbers of duplicate messages presented in Figure 5-1B indicate that both
QCA and MIEN achieve the best results. The numbers of duplicated messages of the MCP
method are substantially higher than those of the others and are not plotted. In fact, the
results of QCA and MIEN are relatively close and tend to approach each other as
time-to-live increases.

In conclusion, QCA is the best social-aware routing algorithm among the five routing
strategies since its delivery ratio and delivery time outperform those of
the other methods and are only below those of MCP, while its number of duplicate messages is
much lower. QCA also shows a significant improvement over the naive LABEL method,
which uses a static community detection method, and thus confirms the applicability of
our adaptive algorithm to routing strategies in MANETs.

5.2  A  Message  Forwarding  and  Routing  Strategy  Employing  AFOCS 

We  present  a  practical  application  where  the  detection  of  overlapping  network 
communities  plays  a  vital  role  in  forwarding  strategies  in  communication  networks.  With 
the  helpful  knowledge  of  the  network  community  structure  discovered  by  AFOCS,  we 
propose  a  new  community-based  forwarding  algorithm  that  significantly  reduces  the 
number  of  duplicate  messages  while  maintaining  competitive  delivery  times  and  ratios, 
which  are  essential  factors  of  a  forwarding  strategy. 

5.2.1  Message  forwarding  strategy 

Let us first discuss how our new forwarding algorithm works in practice and then
how AFOCS helps it to overcome the above limitations. We use AFOCS to detect
overlapping communities and keep them up-to-date as the network changes. Each node
in a community is assigned the same label and each overlapped node u has a set of
corresponding labels Com(u). During the network operation, if a device u carrying the
message meets another device v that shares more community labels
with the destination than u does, i.e., $|Com(v) \cap Com(dest)| > |Com(u) \cap Com(dest)|$, then
u will forward the message to v. The same actions then apply to v as well as to devices
that v meets.
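The forwarding test above amounts to comparing two set intersections. A minimal sketch, assuming community labels are stored as Python sets and using hypothetical function names, is:

```python
def shared_labels(com_x, com_dest):
    """Number of community labels a device shares with the destination."""
    return len(set(com_x) & set(com_dest))

def should_forward(com_u, com_v, com_dest):
    """True if carrier u should hand the message to the device v it meets."""
    return shared_labels(com_v, com_dest) > shared_labels(com_u, com_dest)
```

Because the comparison is strict, a message is not handed over when v shares exactly as many labels with the destination as u does, which avoids creating copies that bring the message no closer to the destination's communities.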

The intuition behind this strategy is that if v shares more communities with the
destination node, it is likely that v will have more chances to deliver the message
to the destination. In this way, we not only have higher chances of correctly
forwarding the messages but also generate far fewer duplicate messages. Due to its
adaptive nature and its ability to identify overlapping communities, AFOCS helps
our algorithm overcome the above shortcomings naturally. This explains why our
forwarding algorithm can significantly reduce the number of duplicate messages while
maintaining very competitive delivery times and ratios.

5.2.2  Setup 

We compare six forwarding strategies: (1) MIEN: a recently proposed social-aware
routing strategy on MANETs [26]; (2) LABEL: a node will forward the messages to
another node if it is in the same community as the destination [46]; (3) WAIT: the source
node waits and keeps forwarding the message until it meets the destination; (4) MCP: a
node keeps forwarding the messages until they reach the maximum number of hops; (5)
QCA: a LABEL version utilizing QCA [81] as the adaptive disjoint community detection
method; and lastly (6) AFOCS: our newly proposed forwarding algorithm equipped with
AFOCS as a community detection and update core.

Results  of  WAIT  and  MCP  algorithms  provide  us  the  lower  and  upper  bounds  of 
important  factors:  message  delivery  ratio,  time  redundancy  and  message  redundancy. 
Our  experiments  are  performed  on  the  Reality  Mining  dataset  provided  by  the  MIT 
Media  Lab  [29].  This  dataset  contains  communication,  proximity,  location,  and  activity 
information  from  100  students  at  MIT  over  the  course  of  the  2004-2005  academic 
year.  In  particular,  we  take  into  account  the  Bluetooth  information  to  construct  the 
underlying  communication  network  and  evaluate  the  performance  of  the  above  six 
routing  strategies. 

Figure 5-2. Experimental results on the Reality Mining data set

In each experiment, 500 message sending requests are randomly generated and
distributed over different time points. To control the forwarding process, we use the
hop-limit, time-to-live, and max-copies parameters. A message cannot be forwarded more
than hop-limit hops in the network or remain in the network longer than time-to-live;
otherwise it is automatically discarded. Moreover, the maximum number of copies of the
same message that a device can forward to others is restricted by max-copies. Each
experiment is repeated and the results are averaged for consistency.
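The three control parameters can be read as a simple admission check performed before each forwarding step. The following is only a minimal sketch under that reading; the names `Message`, `can_forward`, and `copies_sent` are hypothetical and do not come from the original experiments.

```python
from dataclasses import dataclass


@dataclass
class Message:
    created_at: float  # time (in seconds) at which the message was generated
    hops: int = 0      # number of hops the message has travelled so far


def can_forward(msg, now, copies_sent, hop_limit, time_to_live, max_copies):
    """Return True if a device may forward one more copy of msg.

    copies_sent is the number of copies of this message that the device
    has already forwarded (an assumed bookkeeping value).
    """
    if msg.hops >= hop_limit:
        return False  # one more hop would exceed hop-limit
    if now - msg.created_at > time_to_live:
        return False  # the message has outlived time-to-live and is discarded
    if copies_sent >= max_copies:
        return False  # the device has used up its max-copies budget
    return True
```

A forwarding strategy such as LABEL or AFOCS would apply its own community-based rule on top of this check; the check itself is independent of the strategy.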

5.2.3  Results 

Our results are presented in Figures 5-2A, 5-2B, and 5-2C. The first observation
is that our proposed forwarding algorithm achieves the lowest number of duplicate
messages, as depicted in Figure 5-2A, far fewer than even the second best method,
QCA. On average, only 46.5 duplicate messages are generated by AFOCS during the
evaluation process, in contrast with 212.2 for QCA, 274.2 for MIEN, 496.4 for LABEL
and the huge 1071.0 overhead messages of MCP. Thus, on the number of duplicate
messages, AFOCS achieves improvement factors of 4.5x, 5x, 11x and 23x
over these strategies, respectively. This extremely low overhead strongly
implies the efficiency of AFOCS in communication networks.

Figures 5-2B and 5-2C present our results on the other two important factors,
the message delivery ratio and the delivery time. These figures indicate
that AFOCS achieves competitive results on both factors. In general,
AFOCS is the second best strategy with almost no noticeable difference between itself
and the leading method, LABEL. On average, AFOCS gets 33% of the total messages
delivered in 3569.2s, lagging only slightly behind MCP (34% in 3465.3s) and LABEL
(slightly over 33% in 3462.7s), and achieves a higher delivery ratio than MIEN (32% in
3537.6s) and QCA (32% in 3572.2s). This can be explained by the advantages of knowing
the overlapping community structure: with the disjoint network communities in QCA and
MIEN, messages can be forwarded to the wrong communities when the destination changes
its membership. With the ability to quickly update the network structure, AFOCS can
efficiently cope with this change and thus can still provide the most up-to-date
forwarding information.

In summary, AFOCS helps our forwarding strategy reduce the number of duplicate
messages by up to 11x while keeping a good average delivery ratio and time. These
experimental results are highly competitive and confirm the effectiveness of
AFOCS and our new routing algorithm on communication networks.
CHAPTER 6

SOLUTIONS  FOR  WORM  CONTAINMENT  IN  ONLINE  SOCIAL  NETWORKS 

In this chapter, we present another practical application of our proposed algorithms:
the worm containment problem in OSNs. We first suggest a solution based on QCA,
and then describe how AFOCS can help to improve the performance of this solution.
Since their introduction, popular social network sites such as Facebook, Twitter,
Bebo, and MySpace have attracted millions of users worldwide, many of whom have
integrated those sites into their everyday lives.

On  the  bright  side,  OSNs  are  ideal  places  for  people  to  keep  in  touch  with  friends  and 
colleagues,  to  share  their  common  interests,  or  just  simply  to  socialize  online.  However, 
on  the  other  side,  social  networks  are  also  fertile  grounds  for  the  rapid  propagation  of 
malicious software (such as viruses or worms) and false information.

Facebook, one of the most famous social sites, experienced a wide propagation of
a trojan worm named “Koobface” in late 2008. Koobface made its way not only through
Facebook but also through the Bebo, MySpace and Friendster social networks [31][53].
Once a user’s machine is infected, this worm scans through the current user’s profile
and sends out fake messages or wall posts to everyone in the user’s friend list with
titles or comments that appeal to people’s curiosity. If one of the user’s friends,
attracted by the comments without a shadow of doubt, clicks on the link and installs
the fake “flash player”, his computer will be infected and Koobface’s life cycle will
then repeat on this newly infected machine.

The worm containment problem becomes more and more pressing in OSNs as these
networks evolve and change rapidly over time. The dynamics of social networks
thus give worms more chances to spread faster and wider, as they can flexibly
switch between existing and new users in order to propagate. Therefore, containing
worm propagation on social networks is extremely challenging in the sense that a good
solution at the previous time step might not be sufficient or effective at the next time
step. Although one can recompute a new solution each time the network changes,
doing so would incur heavy computational costs, be time consuming, and allow worms
to spread wider during the recomputation. A better solution should quickly and
adaptively update the current containment strategy based on changes in network
topology, and thus avoid the hassle of recomputation.

Figure 6-1. A general worm containment strategy.

Many methods have been proposed for worm containment on computer networks,
using a multi-resolution approach [97], a simplification of the Threshold
Random Walk scan detector [106], or fast and efficient worm signature generation
[51]. There are also several methods proposed for cellular and mobile networks [104][7].
However, these approaches fail to take into account the community structure as well as
the dynamics of social networks, and thus might not be appropriate for our problem. A
recent work [110] proposed a social-based patching scheme for worm containment on
cellular networks. However, this method encounters the following limitations on a real
social network: (1) its clustered partitions do not necessarily reflect the natural network
communities, (2) it requires that the number of clusters k (which is generally unknown
for social networks) be specified beforehand, and (3) it exposes weaknesses when
dealing with the network’s dynamics.
6.1  An  Application  of  QCA  in  Containing  Worms  in  OSNs


6.1.1  Setup 

To  overcome  these  limitations,  our  approach  first  utilizes  QCA  to  identify  the 
network  community  structure,  and  adaptively  keeps  this  structure  updated  as  the 
network  evolves.  Once  network  communities  are  detected,  our  patch  distribution 
procedure  will  select  the  most  influential  users  from  different  communities  in  order 
to send patches. These users, as soon as they receive patches, will apply them to
first disinfect the worm and then redistribute them to all friends in their communities.
These actions will contain worm propagation to only some communities and prevent it
from spreading out to a larger population. To this end, a quick and precise community
detection method helps the network administrator select a more suitable
set of critical users to send patches to, thus reducing the number of sent patches as
well as the overhead information over the social network.

Algorithm 15 Patch Distribution

Input: $G = (V, E)$ and its community structure $\mathcal{C} = \{C_1, C_2, \ldots, C_p\}$
Output: The set of influential users $\mathcal{V}$.

1: $\mathcal{V} = \emptyset$;
2: for $C_i \in \mathcal{C}$ do
3:    while $\exists\, u$ unvisited in $C_i$ satisfying $\max_{u \in C_i} \{e^{out}(u)\} > 0$ do
4:       Let $v \leftarrow \arg\max_{u \in C_i} \{e^{out}(u)\}$;
5:       $\mathcal{V} = \mathcal{V} \cup v$;
6:       Mark $v$ as visited in $C_i$;
7:    end while
8: end for
9: Send patches to users in $\mathcal{V}$;


We next describe our patch distribution procedure. This procedure takes into account
the identified network communities and selects a set of influential users from each
community in order to distribute patches. Influential users of a community are the ones
having the most relationships or connections to other communities. From an adversary's
point of view, these influential users are potentially vulnerable since they interact
actively not only within their communities but also with people outside, and thus they
can easily fool (or be fooled by) people both inside and outside of their communities.
From the defender's point of view, these users are also the best candidates for
distributing patches since they can easily announce and forward patches to
other members and non-members.

Figure 6-2. Infection rates on static network with k = 150 clusters

In Alg. 15, we present a quick algorithm for selecting the set of most influential
users in each community. This algorithm starts by picking the user whose number of
social connections to outside communities is the highest, and temporarily disregards
this user from the community under consideration. This process repeats until no
connections crossing between communities remain. This set of influential users is the
candidate set to which the network defender distributes patches.
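The selection step can be sketched in a few lines. This is a minimal sketch of a literal reading of Alg. 15, assuming that the number of outside connections of a user is computed once per community and not updated after a user is picked; the function name `select_influential_users` and the adjacency-dictionary representation are illustrative choices, not part of the original work.

```python
def select_influential_users(adj, communities):
    """Pick, community by community, the users with connections to outside.

    adj:          dict mapping each user to the set of its neighbors
    communities:  list of disjoint sets of users
    Returns the selected users in the order in which they are picked.
    """
    selected = []
    for community in communities:
        # Number of connections each member has to users outside the community.
        e_out = {u: sum(1 for v in adj.get(u, ()) if v not in community)
                 for u in community}
        visited = set()
        while True:
            # Unvisited members that still have at least one outside connection.
            candidates = [u for u in sorted(community)
                          if u not in visited and e_out[u] > 0]
            if not candidates:
                break
            # max() returns the first maximal element, so ties are broken
            # by the sorted order of the user identifiers.
            v = max(candidates, key=e_out.get)
            selected.append(v)
            visited.add(v)
    return selected
```

Under this reading the procedure returns every boundary user of every community, ordered by decreasing number of outside connections; a variant that removes covered cross-community links after each pick would return a smaller set.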
Figure  6-3.  Infection  rates  on  dynamic  network  with  k  =  200  clusters

6.1.2  Results 

We present the results of our QCA method on the Facebook network dataset
[100] and compare them with the social-based method (Zhu’s method [110]) via a
weighted version of our algorithms.

The worm propagation model in our experiments mimics the behavior of the
famous “Koobface” worm. The probability of activating the worm is proportional
to the communication frequency between the victim and his friends. The time taken
for a worm to spread from one user to another is inversely proportional to the
communication frequency between this user and his particular friend. Finally, when a
worm has successfully infected a user’s computer, it will start propagating as soon as
this computer connects to a specific social network (Facebook in this case). When the
fraction of infected users reaches a threshold α, the detection system raises an alarm
and patches are automatically sent to the most influential users selected by Alg. 15.
Once a user receives the patch, he first applies it to disinfect the worm and then
has the option to forward it to all friends in his community. Each experiment is seeded
with 0.02% of users initially infected by worms.
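The two proportionality rules and the alarm condition can be illustrated as follows. The text does not give the proportionality constants, so the normalization by `max_freq`, the `base_delay` constant, and both function names are assumptions made only for this sketch; `alpha` stands for the alarm threshold.

```python
def infection_parameters(freq, max_freq, base_delay=1.0):
    """Map a communication frequency to (activation probability, spreading delay).

    The probability grows linearly with freq and the delay shrinks as 1/freq.
    Normalizing by max_freq (the largest frequency in the network) is an
    assumed choice that keeps the probability in [0, 1].
    """
    if freq <= 0 or max_freq <= 0:
        raise ValueError("frequencies must be positive")
    activation_prob = min(1.0, freq / max_freq)
    delay = base_delay * max_freq / freq
    return activation_prob, delay


def alarm_raised(num_infected, num_users, alpha):
    """Return True once the fraction of infected users reaches alpha."""
    return num_infected / num_users >= alpha
```

For instance, a pair of friends communicating at half the maximum frequency gets an activation probability of 0.5 and a delay twice as long as the most active pair.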

We compare the infection rates of Zhu’s social-based method and ours. The
infection rate is computed as the fraction of the remaining infected users over all infected
ones. The number of clusters k in Zhu’s method is set to 150 in static and 200 in
dynamic networks, and for each value of k, the alarm threshold α is set to 2%,
10%, and 20%, respectively. Each experiment is repeated 1000 times for consistency.

Figures 6-2 and 6-3 show the results of our experiments for three different values of
k and α. We first observe that the longer we wait (the higher the alarm threshold is),
the more users we need to send patches to in order to achieve the desired
infection rate. For example, with k = 150 clusters and an expected infection rate of 0.3,
we need to send patches to fewer than 10% of users when α = 2%, to more than
15% of users when α = 10%, and to nearly 90% of the total influential users when
α = 20%.

A second observation is that our approach achieves better infection rates than
Zhu’s social-based method on a static version of the social network, as depicted
in Figure 6-2. In particular, the infection rates obtained by our method are 5% to
10% better than those of Zhu’s. When the network evolves as new users join and new
social relationships are introduced, we resize the number of clusters k and recompute the
infection rates of the social-based method with the number of clusters k = 200 and the
alarm threshold α = 2% and 10%, respectively. As depicted in Figure 6-3, our method,
with the power of quickly and adaptively updating the network community structure,
achieves better infection rates than Zhu’s method while the computational costs and
running time are significantly reduced. As discussed, detecting and updating the network
community structure is the crucial part of a social-based patching scheme: a good and
up-to-date network community structure provides the network defender with a tighter set
of vulnerable users, and thus helps to achieve lower infection rates. Our adaptive
algorithm, instead of recomputing the network structure every time changes are
introduced, quickly and adaptively updates the network communities on the fly. Thanks to
this frequently updated community structure, our patch distribution procedure is able to
select a better set of influential users, and thus helps in reducing the number of
infected users.

We further look into the behavior of Zhu’s method when the number of
clusters k varies. We compute the infection rates on the Facebook dataset
for various k ranging from 1K to 2.5K and compare them with our approach. One might
expect that the more predefined clusters there are, the better the infection rates the
clustered partitioning method will achieve. However, the experimental results reveal the
opposite. In particular, with a fixed alarm threshold α = 10% and 60% patched nodes,
the infection rates achieved by Zhu’s method do not decrease but stay near 28%, while
ours are far better (20%) with much less computational time.

Finally, a comparison of the running times of the two approaches shows that Zhu’s
method takes much longer than our community updating procedure, which
may prevent it from completing in a timely manner. In particular, our approach
takes only 3 seconds to obtain the basic community structure and at most 30
seconds to complete all the tasks, whereas [110] requires more than 5 minutes to divide
the communication network into modules and select the vertex separators. During that
delay, the worm may spread out to a larger population, and thus the solution
may not be effective. These experimental results confirm the robustness and efficiency
of our approach on social networks.
6.2  Containing  Worms  with  Overlapping  Communities  Detected  by  AFOCS

We show another application of AFOCS to the worm containment problem on OSNs.
OSNs are good places for people to socialize online or to stay in touch with friends and
colleagues. However, when some of the users are infected with malicious software, such
as viruses or worms, OSNs are also fertile grounds for its rapid propagation. Since
mobile devices are able to access online social applications nowadays, worms and
viruses can now target both computers [81] and mobile devices [110].

Recently,  community  structure-based  methods  have  been  proven  to  be  effective 
solutions to prevent worms from spreading out wider on not only social networks [81][82]
but  also  cellular  networks  [110].  Due  to  the  high  and  low  frequencies  of  interactions 
inside  and  between  communities,  worms  spread  out  quicker  within  a  community  than 
between  communities.  Therefore,  an  appropriate  reaction  should  first  contain  worms  into 
only  infected  communities,  and  then  prevent  them  from  getting  outside.  This  strategy 
can  be  accomplished  by  patching  the  most  influential  members  who  are  well-connected 
not  only  to  members  of  their  community  but  also  to  people  in  other  communities. 

6.2.1  Setup 

In our experiments, we use the Facebook network dataset collected in [100]. This data
set contains friendship information and wall posts among the New Orleans regional network,
spanning from Sep 2006 to Jan 2009. The data set contains more than 63.7K nodes
(users) connected by more than 1.5 million friendship links. We keep the other parameters
as well as the “Koobface” worm propagation model the same as in [82] for ease of
comparison. With the advantage of knowing the overlapping communities, we are able
to develop a better and more efficient patching scheme. In particular, we enhance the
patching scheme presented in [82] to take advantage of the overlap regions:
nodes on the boundary of overlapped regions are selected for patching (Figure 6-4A).

Alg. 16 details the adjusted scheme.
Figure  6-4.  OverCom  patching  scheme.


Algorithm 16 OverCom Patching Scheme

Input: $G = (V, E)$ and $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ detected by AFOCS
Output: A set of patched nodes $IS$.

1: $IS \leftarrow \emptyset$;
2: for ($C_i, C_j \in \mathcal{C}$) do
3:    if ($C_i \cap C_j \neq \emptyset$) then
4:       %Choose the neighbors of overlapped nodes as influential ones%
5:       $IS \leftarrow IS \cup N(u)$ $\forall u \in C_j \cap C_i$;
6:    end if
7: end for
8: %Patch distribution procedure%
9: for ($u \in IS$) do
10:   Send patches to $u$;
11:   Let $u$ redistribute patches to $w \in IS \setminus N(u)$;
12: end for
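The selection part of the scheme (steps 1 to 7) can be sketched as below. This is a minimal sketch, assuming the overlapping communities are already available as Python sets and the graph is given as an adjacency dictionary; the function name `overcom_patch_set` is hypothetical, and the patch distribution loop (steps 8 to 12) is not modeled.

```python
from itertools import combinations


def overcom_patch_set(adj, communities):
    """Collect the neighbors of all nodes lying in some community overlap.

    adj:          dict mapping each node to an iterable of its neighbors
    communities:  list of (possibly overlapping) sets of nodes
    """
    patched = set()
    # Examine every unordered pair of communities once.
    for c_i, c_j in combinations(communities, 2):
        overlap = c_i & c_j
        # Neighbors of overlapped nodes are chosen as the influential ones.
        for u in overlap:
            patched.update(adj.get(u, ()))
    return patched
```

When no two communities overlap, the returned set is empty, so this scheme relies on the overlapping structure produced by a method such as AFOCS.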


6.2.2  Results 

We compare the OverCom patching scheme with overlapping communities found
by AFOCS to those using disjoint communities proposed by Blondel et al. [6], QCA by
Nguyen et al. [81] and the clustering-based method suggested by Zhu et al. [110]. The
number of patched nodes is shown in Figure 6-4B. Both the number of patched nodes
and the infection rates decline remarkably. In particular, the number of nodes to send
patches to with AFOCS is roughly half of that required by Blondel's, QCA and
Zhu’s methods: only 1725 of the 63K nodes in the network need
to be patched by the OverCom patching scheme, while the other schemes require nearly
twice as many (>3,300 nodes). The reason behind this improvement lies in the nature of
our AFOCS framework: the neighbors of the overlapped nodes should not be too far away
from the center of each community, and thus they can easily redistribute the patches once
received.

Figure 6-5. Infection rates between four methods.

We next present the achieved infection rates with alarm thresholds (the fraction
of infected nodes over all nodes) α = 2%, 10% and 20%, respectively. The patch
distribution process is triggered as soon as the infection rate goes beyond α. The results
are reported in Figures 6-5A, 6-5B, and 6-5C, respectively. In general, the higher α is
(i.e., the longer we wait), the more nodes we have to send patches to and the higher the
infection rate. OverCom with AFOCS achieves the lowest infection rates in almost all the
experiments and lags only slightly behind when α = 10%. In particular, when α = 2%,
AFOCS helps OverCom reduce the infection rates by 1.6x up to 4.3x relative to QCA, by
2.6x up to 4x relative to Blondel and by 3.2x to 7x relative to Zhu’s method. When
α = 10%, AFOCS + OverCom achieves average improvements of 9% over QCA, 5%
over Blondel and 43% over Zhu’s method. When α = 20%, the average improvements are
12%, 23% and 53%, respectively. Due to the nature of the event handling processes,
the neighbors of overlapped nodes are not located far away from the rest of their
communities. As a result, they can help to distribute patches to more users in the
communities, and hence help to lower the infection rates of AFOCS. These improvement
factors, again, confirm the effectiveness of our proposed method.
CHAPTER 7

STABLE  COMMUNITY  DETECTION  IN  ONLINE  SOCIAL  NETWORKS 

A large body of work has been devoted to finding general communities (i.e., without the
concept of stability) on both directed and undirected networks in the literature [32]. In
contrast, only very few approaches have been suggested to identify stable communities
[59][68], especially on directed and weighted networks. The main source of difficulty
is the inconsistency of community members in a general structure: while they
might appear to be in a community at one time, they may not commit to that particular
community in the long run. One possible approach, therefore, is to find a consensus of
a specific algorithm after multiple runs and use this core as stable communities [59].
However, doing so is computationally expensive and time consuming, and lacks
convergence guarantees. In [68], the authors estimate the
mutual links between pairs of users and suggest a detection method that optimizes the
total mutual connection on the whole network. While the idea of mutual connection is
quite interesting, we find that it might not be sufficient because some estimated mutual
links are of low magnitude, and thus may not reflect the correct concept of stability at
the community level.

In general, a stable community is often characterized either by its tight and strong
internal relationships represented by the mutual connections among its users [68],
or by its internal links that possess a high tendency to remain within the community
over a long period of time [23]. In other words, stable communities in the network
are commonly characterized by stable connections among their members. Motivated
by these observations, we suggest SCD (short for Stable Community Detection), a
framework to effectively identify stable communities in directed OSNs that accommodates
both of the above intuitions. In the big picture, SCD works by first enriching the input
network with the stability estimation of all links in the network, and then discovering
communities via stable connections using the lumped Markov chain model. Our approach is
mathematically supported by a key connection between the persistence probability of a
community at the stationary distribution and its local topology. One notable advantage of
SCD is that it requires only a single iteration, which significantly reduces the running
time. Furthermore, since our method intrinsically accounts for stability, the discovered
communities should be stable without requiring a separate statistical analysis.

In  summary,  we  suggest  an  estimation  which  provides  helpful  insights  into  the 
stability  of  links  in  the  input  network.  Based  on  that,  we  propose  SCD  -  a  framework 
to  identify  community  structure  in  directional  OSNs  with  the  advantage  of  community 
stability.  We  next  explore  an  essential  connection  between  the  persistence  probability 
of  a  community  at  the  stationary  distribution  and  its  local  topology,  which  is  the 
fundamental  mathematical  theory  to  support  the  SCD  framework.  To  certify  the 
efficiency  of  our  approach,  we  extensively  test  SCD  on  both  synthesized  datasets 
with  embedded  communities  and  real-world  social  traces,  including  NetHEPT  and 
NetHEPT.WC  collaboration  networks  as  well  as  Facebook  social  networks,  in  reference 
to  the  consensus  of  other  state-of-the-art  detection  methods.  Highly  competitive 
empirical  results  confirm  the  quality  and  efficiency  of  SCD  on  identifying  stable 
communities  in  OSNs. 

7.1  Basic  Notations 

We  introduce  the  basic  notations  representing  the  underlying  social  network  that  we 
will  use  throughout  this  paper. 

(Graph notation) Let $G = (V, E, w)$ be a directed and weighted graph representing
a social network, where $V$ is the set of $n$ network users (or nodes), $E$ is the set of
$m$ directed relationships (or edges), and $w$ (or precisely $w_{uv}$) is the weight
function on each edge $(u, v) \in E$ representing the communication frequency between
users $u$ and $v$ in the social network. Without loss of generality, we assume that all
edge weights are normalized, i.e., $\sum_{(u,v) \in E} w_{uv} = 1$ and $w_{uv} > 0$. For
each edge $(u, v) \in E$ for which $(v, u) \notin E$, we say that the backwards edge
$(v, u)$ is missing. We will use the notation $\langle v, u \rangle$ and
$w_{\langle vu \rangle}$ to denote the mutual link of edge $(u, v)$, if $(v, u)$ should
indeed exist in $E$, and its weight, respectively. Furthermore, we will use the notation
$st(u, v, t)$ to denote the stability estimate of edge $(u, v)$ at time step (or hop) $t$.
These notations will be described in detail in the next section.

(Community notation) Denote by $\mathcal{C} = \{C_1, C_2, \ldots, C_q\}$ the network
community structure, i.e., a collection of $q$ subsets of $V$ satisfying
$\bigcup_{i=1}^{q} C_i = V$ and $C_i \cap C_j = \emptyset$ for all $i \neq j$.
We say that each $C_i \in \mathcal{C}$ and its induced subgraph form a community of $G$.
For a node $u \in V$, let $N_u^+$, $N_u^-$ and $N_u$ denote the set of outgoing, the set
of incoming, and the set of all neighbor nodes adjacent to $u$, respectively.
Furthermore, let $k_u^+$ (or $w_u^+$), $k_u^-$ (or $w_u^-$) and $k_u$ (or $w_u$) be the
corresponding cardinalities (or total weights) of these sets. For any $C \subseteq V$,
let $C^{in}$ and $C^{out}$ denote the set of links having both endpoints in $C$ and the
set of links heading out from $C$, respectively. In addition, let $m_C = |C^{in}|$ (resp.
$w_C = w(C^{in})$) and $k_C^+ = \sum_{u \in C} k_u^+$ (resp.
$w_C^+ = \sum_{u \in C} w_u^+$). Finally, the terms node-vertex as well as
edge-link-connection are used interchangeably.

7.2  Link  Stability  Estimation 

We  describe  our  first  step  towards  the  identification  of  stable  communities  in  the 
network:  the  link  stability  estimation  process.  Intuitively,  a  stable  community  is  often 
characterized  either  by  its  tight  and  strong  internal  relationships  represented  by  the 
mutual connections among its users [68], or by its internal links that possess a high
tendency  to  remain  within  the  community  over  a  long  period  of  time  [23].  In  other  words, 
stable  communities  in  the  network  are  commonly  characterized  by  stable  connections 
among  their  members.  Motivated  by  these  observations,  in  this  section,  we  suggest  a 
procedure  for  estimating  the  stability  of  each  link  in  the  network  that  facilitates  both  of 
the above intuitions. Our estimation procedure consists of two stages: in the first stage,
the reciprocity of each link in the network is predicted, and based on that, its stability
is evaluated in the second stage.
7.2.1  Link  reciprocity  prediction

When dealing with large scale OSNs, it is possible that some backwards edges
between individuals are missing. This lack of information may be due to an imperfect
data collection process, or because these backwards edges are not yet reflected
in the underlying network but should be, given the strong relationships between local
network users. For instance, Leskovec et al. [65] observe that friends of friends in social
networks tend to become friends with each other in the near future, i.e., there should be
dual connections between friends of friends with high probability even if they are not
yet friends with each other. Therefore, predicting the existence of these backwards edges
will allow a more complete and comprehensive detection of stable communities by
increasing the internal density of strongly connected components, which are potential
candidates for network communities.

Link reciprocity prediction is a well-studied problem and many methods
have been proposed in the literature [69][3][30]. In this paper, we utilize a method
called “friends-measure” suggested in [30]. The intuition behind this measure is that when
looking at two users in the social network, one can assume that the more connections
their neighbors have with each other, the higher the chance the two users are actually
connected. Originally, the friends-measure between two users $u$ and $v$ is formulated as:

$$\text{friends-measure}(u, v) = \sum_{x \in N_u} \sum_{y \in N_v} \delta(x, y)$$

where $\delta(x, y) = 1$ if either $x = y$ or $(x, y) \in E$ or $(y, x) \in E$, and
$\delta(x, y) = 0$ otherwise. This measure has
been extensively verified among other topological features and has been shown to be
a promising one in comparison with other metrics [30]. However, in the case of directed
networks, different link topologies can share a common
friends-measure value. Therefore, we need to modify the above formula so that it
reflects the true relationship between the network users, and furthermore copes with
edge weights in the network.
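The original friends-measure can be computed directly from its definition. The sketch below assumes the graph is given as a collection of directed edge tuples and takes the neighborhood of a node to be all nodes adjacent to it in either direction; the function name `friends_measure` and this input format are illustrative assumptions.

```python
def friends_measure(edges, u, v):
    """Count the pairs (x, y), with x a neighbor of u and y a neighbor of v,
    such that x == y or x and y are linked in either direction."""
    edge_set = set(edges)

    def neighbors(a):
        # All nodes adjacent to a, regardless of edge direction.
        outgoing = {y for (x, y) in edge_set if x == a}
        incoming = {x for (x, y) in edge_set if y == a}
        return outgoing | incoming

    def delta(x, y):
        linked = (x, y) in edge_set or (y, x) in edge_set
        return 1 if x == y or linked else 0

    return sum(delta(x, y) for x in neighbors(u) for y in neighbors(v))
```

For the edge list `[(1, 2), (1, 3), (2, 3), (4, 3)]`, users 1 and 4 share neighbor 3 and neighbor 2 of user 1 is linked to neighbor 3 of user 4, so the measure equals 2. The value does not depend on edge directions, which illustrates why different directed topologies can share the same friends-measure.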
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In  order  to  better  handle  directed  and  weighted  graphs,  we  will  attempt  to  predict 
the  existence  of  backwards  edges  of  unidirectional  links.  For  example,  if  (u,  v)  e  E  and 
(v,  u)  <£  E  ,  we  will  try  to  find  the  possibility  whether  we  should  enrich  the  network  by 
inserting  (v,  u)  into  E.  To  this  end,  we  first  relax  the  direction  of  the  edge  between  u  and 
i/,  and  next  compute  the  likelihood  that  a  backwards  edge  should  exist  between  u  and  1/ 
by  using  the  modified  formula 

A(  v )  =  Exew,  T(x’  y)  (7_1 J 

WyWu 

Where  r(x,  y)  =  wvxwxywyu  is  the  total  possibility  of  the  backwards  path  starting 
from  v,  passing  through  neighbor  nodes  x  and  y,  and  ending  at  u.  When  the  network 
is  unweighted  Wu  =  du,  Wv  =  dv  and  thus,  A (u,  1/)  counts  the  (normalized)  number  of 
paths  of  lengths  two  and  three  joining  two  users  u  and  1/,  which  intuitively  agrees  with 
the  aforementioned  friends-measure  formula.  By  Proposition  7.1 ,  we  show  that  A (u,  v) 
is  indeed  the  generalization  of  weighted  friend-measure(t/,  v)  and  depends  only  on  the 
nodes’  topology.  Hence,  A (u,  v)  can  be  regarded  as  the  estimated  probability  that  the 
backwards  connection  <  v,  u  >  indeed  exists,  i.e.,  we  set  w<vu>  =  A (u,  v). 
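
As a small illustration of the original (unweighted) friends-measure recalled above, the following Python sketch counts the neighbor pairs (x, y) that are identical or linked in either direction. It assumes the graph is given as a set of directed edge tuples and that $N_u$ contains every node adjacent to u in either direction; the text does not spell out this neighborhood convention, and the names `neighbors` and `friends_measure` are illustrative only.

```python
def neighbors(edges, u):
    """Nodes adjacent to u, ignoring edge direction (an assumed convention)."""
    return {b for a, b in edges if a == u} | {a for a, b in edges if b == u}


def friends_measure(edges, u, v):
    """Sum of delta(x, y) over x in N_u and y in N_v."""
    total = 0
    for x in neighbors(edges, u):
        for y in neighbors(edges, v):
            # delta(x, y) = 1 if x == y or x and y are linked in either direction
            if x == y or (x, y) in edges or (y, x) in edges:
                total += 1
    return total


# Toy directed graph: u -> v is unidirectional; a is a common neighbor,
# and b (a neighbor of u) links to c (a neighbor of v).
edges = {("u", "v"), ("u", "a"), ("a", "v"), ("u", "b"), ("b", "c"), ("c", "v")}
score = friends_measure(edges, "u", "v")
```

Because u and v are neighbors of each other in this toy graph, they also take part in the double sum, exactly as the formula prescribes.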

Proposition 7.1. For any $(u, v) \in E$ with $(v, u) \notin E$, $0 \le A(u, v) \le 1$.

Proof. We first prove this for unweighted graphs. The proof for weighted graphs
can be extended straightforwardly. It is obvious that $0 \le A(u, v)$. Now we show
$A(u, v) \le 1$. For any x, y such that $\delta(x, y) = 1$, if x = y, they can make just one
connection counted towards the summation. Otherwise, they can make at most $d_u$
(or $d_v$) dual-connections at each vertex. Taking these facts into account, we have
$\sum_{x \in N_v} \sum_{y \in N_u} \delta(x, y) \le d_v d_u$, and dividing by the normalization factor
$W_v W_u = d_v d_u$ gives $A(u, v) \le 1$. Thus, the inequalities follow. The left equality
holds when there are no connections from u to v and vice versa. The right equality
holds when every path of length 2 from u to v (or from v to u) is contained in the
corresponding path of length 3. □

Proposition 7.2. Let $n_0$ be the number of unidirectional links in the input network. The
time complexity for estimating the mutual connections for these links is $O(n_0 M)$.

Proof. The total time required for estimating the possibility of a backward connection at
an edge (u, v) is $d_u + d_v + \sum_{x \in N_v^+} \sum_{y \in N_u^-} \min\{d_x^+, d_y^-\}$. Thus, for all $n_0$ links, the total time
complexity is upper bounded by $n_0(2M) + n_0 M = O(n_0 M)$. □

7.2.2  Link  stability  estimation 

After the reciprocity of each link in the network has been estimated, the input
network is enriched with more information about the backwards edges. While the
presence of these dual edges is helpful in characterizing the mutual relationships
between pairs of network users, it might not be sufficient to evaluate the stability of
all network connections, as some of the backwards edges may be of low magnitude
and thus may not be able to indicate the stability of the connection. Therefore, we need to
further estimate the stability of a network link given its predicted reciprocity. In order to
do so, we define the stability of an edge $(u, v) \in E$ at t time steps (or t hops) as follows

$$st(u, v, t) = \sum_{|P| = t} w(P)$$

where P is a path going from v to u (v and u are excluded) of length $|P| = t$, and
$w(P) = \prod_{(a,b) \in P} w_{ab}$ is the total weight of path P. Finally, we define the stability $st(u, v)$
of a link $(u, v) \in E$ as the total stability of up to $T_0$ time steps, where $T_0$ is a predefined
parameter (or the upper bound on the number of hops)

$$st(u, v) = \sum_{t=1}^{T_0} st(u, v, t). \qquad (7\text{-}2)$$

The intuition behind our stability function $st(u, v)$ is as follows: since stable communities
are commonly recognized by a high density of stable edges, it is reasonable to expect
that such edges form cycles. In the sense of directed and weighted networks, the
stronger the cycles an edge (u, v) lies on, the more stable it is believed to be.
On the contrary, edges that connect different communities shall hardly
be part of many cycles, and eventually result in low stability. Figure 7-1 illustrates the
stability estimates for link (u, v) at 0, 1 and 2 hops: a) $st(u, v, 0) = 0.45 = w_{\langle v,u \rangle}$, b)
$st(u, v, 1) = 0.5 \times 0.2 = 0.1$, c) $st(u, v, 2) = 0.5 \times 0.1 \times 0.2 = 0.01$.
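
To make the definition concrete, the sketch below enumerates the simple paths from v to u with exactly t intermediate nodes and sums the products of their edge weights. The dictionary-of-dictionaries graph layout, the function names and the toy weights are illustrative assumptions rather than the data behind Figure 7-1; restricting the search to simple paths is also an assumption, since the text does not say whether nodes may repeat.

```python
def stability_at(w, u, v, t):
    """st(u, v, t): total weight of simple paths v -> ... -> u with t intermediate nodes."""
    total = 0.0

    def extend(node, remaining, weight, visited):
        nonlocal total
        if remaining == 0:
            # The last hop must reach u to close the cycle through edge (u, v).
            if u in w.get(node, {}):
                total += weight * w[node][u]
            return
        for nxt, w_edge in w.get(node, {}).items():
            if nxt != u and nxt not in visited:
                extend(nxt, remaining - 1, weight * w_edge, visited | {nxt})

    extend(v, t, 1.0, {v})
    return total


def stability(w, u, v, hops):
    """Sum of st(u, v, t) over the hop counts in `hops`."""
    return sum(stability_at(w, u, v, t) for t in hops)


# Toy weights (hypothetical): w[a][b] is the weight of the directed edge a -> b.
w = {
    "u": {"v": 1.0},
    "v": {"u": 0.45, "a": 0.5},
    "a": {"u": 0.2, "b": 0.1},
    "b": {"u": 0.2},
}
per_hop = [stability_at(w, "u", "v", t) for t in (0, 1, 2)]
```

With this layout, `stability(w, "u", "v", range(1, T0 + 1))` mirrors the sum in equation (7-2) for a chosen bound `T0`.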

As a local measure, our suggested stability function has the following advantages:
(1) it puts more focus on the existence of the mutual link of any link (u, v) by preserving
the original strength of the backwards edge $\langle v, u \rangle$. This intuitively agrees with the
finding in [68] that stable clusters are usually made of bidirectional links. Moreover,
our formula further takes into account the strength of cycles containing the current
link; (2) the more time (or number of hops) we allow, the more stability a link
accumulates. Nevertheless, links that really belong to a stable community are more likely to
have strong stability, whereas those connecting communities are of low stability. These
advantages support the intuitions of stable communities that we discussed above. The
performance of our stability estimation is evaluated in more detail in section 7.4.

In summary, our link stability estimation first predicts the potential of the dual link
of any link $(u, v) \in E$ such that $(v, u) \notin E$ by using the modified measure in equation
(7-1). Next, it evaluates the stability of every link in the given network enriched from
the first stage by using equation (7-2), and utilizes these stability values as new weights
for links in the network. The resulting network is then passed as the input
network to our main process: the identification of stable communities.

7.3 Stable Community Detection

In  this  section,  we  present  our  main  contribution:  the  stable  community  identification 
process.  Given  the  input  network  enriched  with  link  stability  information,  we  discover 
the  stable  communities  by  exploring  an  important  connection  between  the  persistence 
probability  of  each  community  and  its  local  network  topology.  In  the  following  paragraphs, 
we  first  review  the  concept  of  Lumped  Markov  chain  [50][89],  and  then  establish  our  key 
connection  between  this  Markov  chain  and  the  local  network  topology.  Finally,  we 
describe  in  detail  our  last  but  most  important  process:  stable  community  detection. 

7.3.1  Lumped  Markov  chain 

A Markov chain [96] is a mathematical system representing transitions from one
system state to another, among a finite number of predefined states. In terms of
social networks, a state can be either a user (a node in the graph) or a group of tightly
connected users (a community) in the network, whereas transitions can be regarded
as the user-to-user or group-to-group communication tendencies. An n-state Markov
chain corresponding to an n-node network is commonly represented by the transition
$\pi_{t+1} = \pi_t P$, where $\pi_t = (\pi_{1,t}, \pi_{2,t}, \ldots, \pi_{n,t})$ with $\pi_{u,t}$ being the probability of being at
node u at time t, and $P = (p_{uv})$ is the transition matrix. In particular, this n-state Markov
chain can be associated with the input network by letting the probability of transiting from
a node u to a neighbor node v be

$$p_{uv} = \frac{w_{uv}}{\sum_j w_{uj}} = \frac{w_{uv}}{w_u^+}.$$

Basically, $p_{uv}$ is the probability that a random walker jumps from node u to node v given
the network topology. A Markov chain is said to be at its stationary state distribution
$\pi$ if $\pi$ satisfies the equation $\pi = \pi P$. As shown in [43], when the network is originally
connected, P is irreducible, and thus the equation $\pi = \pi P$ has a unique solution
which is strictly positive ($\pi_u > 0 \ \forall u \in V$) and corresponds to the stationary Markov
chain state distribution. When the network is undirected, $\pi$ can be exactly computed as
$\pi = \frac{1}{2W_0}(w_1, w_2, \ldots, w_n)$, where $W_0$ is the total edge weight. However, we do not have an
exact form for the stationary distribution $\pi$ in general for directed networks, and thus $\pi$
has to be computed numerically.
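
One common way to carry out this numerical computation is power iteration on $\pi_{t+1} = \pi_t P$. The text does not say which solver was actually used, so the following is only a minimal sketch under that assumption. The nested-dictionary weight format and the function names are illustrative, and convergence presumes a strongly connected, aperiodic network in which every node has at least one outgoing edge.

```python
def transition_matrix(w, nodes):
    """Row-stochastic P with p_uv = w_uv / sum_j w_uj."""
    P = []
    for u in nodes:
        out = sum(w[u].values())
        P.append([w[u].get(v, 0.0) / out for v in nodes])
    return P


def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Iterate pi <- pi P from the uniform distribution until it stops changing."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Small undirected example (each edge stored in both directions), so the
# result can be compared with the closed form pi_u = w_u / (2 * W0).
nodes = ["a", "b", "c"]
w = {
    "a": {"b": 1.0, "c": 2.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 2.0, "b": 1.0},
}
pi = stationary_distribution(transition_matrix(w, nodes))
closed_form = [3.0 / 8.0, 2.0 / 8.0, 3.0 / 8.0]  # node strengths 3, 2, 3 over 2 * W0 = 8
```

For directed inputs the same two functions apply unchanged; only the closed-form comparison is specific to the undirected case.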

As our ultimate goal is to detect the stable network community structure, we seek
a good partitioning of V in which each partition remains healthy over time. In the
light of the Markov chain method, this corresponds to finding a collection of communities
$\mathcal{C} = \{C_1, C_2, \ldots, C_q\}$ where a random walker would spend most of the time walking inside
a community and less time wandering among communities. By defining this partition $\mathcal{C}$
of q communities, we introduce a so-called q-state meta-network where each community
in the network becomes a meta-state. However, at this aggregate level, a Markovian
description of the dynamics of a random walker walking among communities is in general not
possible because the Markovian property may not be well-preserved [50]. Nevertheless,
this q-state community-to-community transition can still be defined using the lumped
Markov chain, which correctly describes the random walker at this scale given that the
stochastic process is started at the stationary distribution $\pi$ [43]. This lumped Markov
chain is defined via the $q \times q$ matrix U as in [89]

$$U = [\mathrm{diag}(\pi H)]^{-1} H^T \mathrm{diag}(\pi) P H$$

where H is an $n \times q$ binary matrix representing the partitioning $\mathcal{C}$.
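
The matrix product above can be evaluated entry by entry: $u_{cd}$ is the stationary flow from community c to community d divided by the stationary mass of c. The sketch below does this in plain Python, assuming (as is usual for such a partition matrix, though not stated explicitly here) that $H_{ic} = 1$ exactly when node i belongs to community c. The partition is passed as a list of community labels, and all names are illustrative.

```python
def lumped_matrix(P, pi, labels, q):
    """U = diag(pi H)^-1 H^T diag(pi) P H, written out entry by entry."""
    n = len(P)
    mass = [0.0] * q                      # (pi H)_c: stationary mass of community c
    flow = [[0.0] * q for _ in range(q)]  # (H^T diag(pi) P H)_cd
    for i in range(n):
        mass[labels[i]] += pi[i]
        for j in range(n):
            flow[labels[i]][labels[j]] += pi[i] * P[i][j]
    return [[flow[c][d] / mass[c] for d in range(q)] for c in range(q)]


# Undirected 4-node path 0 - 1 - 2 - 3 with unit weights, split into {0, 1} and {2, 3}.
P = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
pi = [1 / 6, 2 / 6, 2 / 6, 1 / 6]   # degree / (2 * number of edges)
labels = [0, 0, 1, 1]
U = lumped_matrix(P, pi, labels, q=2)

# The aggregated vector pi H and its image under U, for comparison.
Pi = [pi[0] + pi[1], pi[2] + pi[3]]
Pi_next = [sum(Pi[c] * U[c][d] for c in range(2)) for d in range(2)]
```

Working with labels instead of an explicit H keeps the cost at O(n^2) for a dense P and avoids building any q-by-n intermediate.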

One of the notable advantages of the lumped Markov chain $\Pi_{t+1} = \Pi_t U$ defined
on U is that it shares the same stationary distribution with the original Markov chain,
i.e., the new stationary distribution defined by $\Pi = \pi H$ satisfies the equation $\Pi = \Pi U$.
Moreover, the difference between $\Pi_{t+1} = \Pi_t U$, starting at $\Pi_0 = \pi_0 H$, and the original
$\pi_t H$ tends exponentially to zero if the two chains are regular. These advantages make
the community-based lumped Markov chain defined by $\Pi_t U$ a very good approximation
of the original n-node network. We stress that the ability of the lumped Markov chain to
describe the random walk dynamics only at stationarity is not a limitation for the detection
of stable communities. Indeed, this stationarity requirement evaluates the random walk
dynamics of all nodes at their stable states, and hence perfectly supports the concept of
stable communities.

In terms of interpretation, each entry $u_{cd}$ of U denotes the chance that a random
walker in community c at time t wanders to another community d at time t + 1. As
a result, the diagonal elements $u_{cc}$ (or $u_C$ for short) of U indicate the persistence
probabilities that a random walker keeps walking within a particular community C. Of
course, large values of $u_C$ are expected for meaningful communities. It is also shown in
[89] that in directed and weighted graphs, $u_C$ can be computed as

$$u_C = \frac{\sum_{i,j \in C} \pi_i p_{ij}}{\sum_{i \in C} \pi_i}.$$

Note that $\sum_{i,j \in C} \pi_i p_{ij}$ is the fraction of time a random walker spends on the links inside
a community C. Hence, $u_C$ is indeed the ratio between the amount of time a random
walker spends on links in C and the amount of time it spends on nodes in C. In undirected
networks, one can verify that

$$u_C = \frac{\sum_{i,j \in C} \pi_i p_{ij}}{\sum_{i \in C} \pi_i} = \frac{2 w_C}{2 w_C + w(C_{out})}. \qquad (7\text{-}3)$$
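
As a quick numerical check of the undirected identity, the sketch below computes $u_C$ from the stationary flow and compares it with $2w_C / (2w_C + w(C_{out}))$. It assumes that $w_C$ denotes the total weight of edges with both endpoints in C and that $w(C_{out})$ denotes the total weight of edges with exactly one endpoint in C. These readings, the edge-list format and the function names are assumptions made for illustration.

```python
def persistence_from_walk(edges, C):
    """u_C = sum_{i,j in C} pi_i p_ij / sum_{i in C} pi_i for an undirected graph."""
    strength = {}
    for a, b, wt in edges:
        strength[a] = strength.get(a, 0.0) + wt
        strength[b] = strength.get(b, 0.0) + wt
    two_w0 = sum(strength.values())
    pi = {u: s / two_w0 for u, s in strength.items()}  # closed form for undirected graphs
    flow = 0.0
    for a, b, wt in edges:
        if a in C and b in C:
            # Both directions a -> b and b -> a stay inside C.
            flow += pi[a] * wt / strength[a] + pi[b] * wt / strength[b]
    return flow / sum(pi[u] for u in C)


def persistence_closed_form(edges, C):
    """2 w_C / (2 w_C + w(C_out))."""
    w_in = sum(wt for a, b, wt in edges if a in C and b in C)
    w_cut = sum(wt for a, b, wt in edges if (a in C) != (b in C))
    return 2 * w_in / (2 * w_in + w_cut)


# Two weighted triangles joined by a single light edge.
edges = [
    (0, 1, 1.0), (1, 2, 2.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 3.0),
    (2, 3, 0.5),
]
C = {0, 1, 2}
u_walk = persistence_from_walk(edges, C)
u_closed = persistence_closed_form(edges, C)
```

The closed form needs only the edges touching C, whereas the walk-based value requires the node strengths of the whole graph to normalize the stationary distribution.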


7.3.2  Connection  to  the  network  topology 

At this stage, one might try to optimize $u_C$ for all communities $C \in \mathcal{C}$ in order to
maximize their persistence probabilities. However, doing so requires solving
for the stationary distribution values $\pi_i$ (as in equation (7-3)), which may be extremely costly,
especially in large-scale directed networks. So, how can we effectively optimize the
persistence probability $u_C$ for each community without solving for that costly exact
stationary distribution? As an answer to this challenging question, we present in
Proposition 7.3 a connection between the persistence probability of a community C
and its local topology. In particular, we show that a lower bound on $u_C$ can be
expressed by quantities that only involve C’s local topology. Therefore, optimizing $u_C$
can be shifted to the optimization of these local components, which are inexpensive and
easy to derive.

Proposition 7.3. For any community $C \in \mathcal{C}$, at the stationary distribution $\pi$, we have the
following inequality

$$u_C = \frac{\sum_{i,j \in C} \pi_i p_{ij}}{\sum_{i \in C} \pi_i} \ge \frac{w_C}{w_C^+}.$$

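Proof. It is easy to see that

$$u_C = \frac{\sum_{i,j \in C} \pi_i p_{ij}}{\sum_{i \in C} \pi_i} = \frac{\sum_{i \in C} \pi_i \frac{w_{i,C}}{w_i^+}}{\sum_{i \in C} \pi_i}$$

where $w_{i,C} = \sum_{j \in C} w_{ij}$. Next, we rewrite $\sum_{i \in C} \pi_i$ in the form $\sum_{i \in C} \pi_i = \pi e_C$ where
$e_C = (e_i)_{n \times 1}$ and $e_i = 1$ if $i \in C$ and 0 otherwise. Since $\pi$ is the stationary distribution of
the Markov chain, we have $\pi = \pi P$. Thus

$$\pi e_C = \pi P e_C = \sum_{i \in C} \sum_{j:(j,i) \in E} \pi_j \frac{w_{ji}}{w_j^+}.$$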


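Now we have,

$$\sum_{i \in C} \pi_i \times w_C = \sum_{i \in C} \Big( \sum_{j:(j,i) \in E} \pi_j \frac{w_{ji}}{w_j^+} \Big) \times \sum_{i \in C} w_{i,C} \le \sum_{i \in C} \pi_i \frac{w_{i,C}}{w_i^+} \times \sum_{i \in C} w_i^+ = \sum_{i \in C} \pi_i \frac{w_{i,C}}{w_i^+} \times w_C^+.$$

Hence, the conclusion follows. Equality holds when all $\pi_i$ are equal to each other and
$w_{i,C} = w_i^+$. This happens when C is a fully dually connected clique and is disconnected
from the rest of the network. □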


7.3.3  Detecting  communities 
7.3.3.1  Formulation 

Proposition 7.3 discussed above establishes the connection
between the persistence probability of a random walker staying within a community C
and the local network topology. As a result, if we can maximize the latter quantity, we
obtain some assurance on the desired optimization with high confidence. Taking this
intuition into account, we propose Stable Community Detection (SCD) as an optimization
problem defined as follows: Given a directed, weighted network G = (V, E, w), find
a community structure $\mathcal{C} = \{C_1, C_2, \ldots, C_q\}$ such that the overall total persistence
probability is maximized:

$$\max \ \mathcal{R} = \sum_{C \in \mathcal{C}} \frac{w_C}{w_C^+}$$

$$\text{subject to} \quad C_i \cap C_j = \emptyset \quad \forall i \ne j \in \{1, 2, \ldots, q\}, \qquad \bigcup_i C_i = V.$$


Note that in our SCD formulation, the number of communities q is determined by
optimizing the objective function $\mathcal{R}$ and is not an input parameter. Indeed, optimizing
$\mathcal{R}$ provides a very good estimate q of the actual number of communities, as we will
show in section 7.4.
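
For illustration, the objective can be evaluated for a candidate partition as in the sketch below. It assumes that $w_C$ is the total weight of directed edges with both endpoints in C and that $w_C^+$ is the total weight of all edges leaving nodes of C (internal ones included). This reading is consistent with the equality case of Proposition 7.3 but is not spelled out in this section; the function names and the toy graph are hypothetical.

```python
def community_score(w, C):
    """w_C / w_C^+ for one community C, with w[a][b] the weight of edge a -> b."""
    w_in = sum(wt for a in C for b, wt in w.get(a, {}).items() if b in C)
    w_out = sum(wt for a in C for wt in w.get(a, {}).values())
    return w_in / w_out if w_out > 0 else 0.0


def scd_objective(w, partition):
    """R = sum over communities of w_C / w_C^+."""
    return sum(community_score(w, C) for C in partition)


# Two directed 3-cycles joined by one edge 2 -> 3 (unit weights).
w = {
    0: {1: 1.0}, 1: {2: 1.0}, 2: {0: 1.0, 3: 1.0},
    3: {4: 1.0}, 4: {5: 1.0}, 5: {3: 1.0},
}
split = scd_objective(w, [{0, 1, 2}, {3, 4, 5}])
merged = scd_objective(w, [{0, 1, 2, 3, 4, 5}])
```

On this toy graph the two-community partition scores higher than the single merged community, which is the behavior the formulation is meant to reward.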

7.3.3.2  Resolution  limit  analysis 

Perhaps one of the most important properties that a metric for identifying
community structure should satisfy is the ability to overcome the resolution limit [35],
i.e., the metric should be able to detect network communities at different scales.
In this subsection, we analyze the resistance of our proposed function $\mathcal{R}$ to the
resolution limit by looking particularly at the condition under which two communities
should be merged together. In what follows, we simplify the situation by considering
undirected networks.

Let us consider two communities $C_1$ and $C_2$. Let $m_{12}$ be the number of edges
connecting $C_1$ and $C_2$. In order to merge $C_1$ and $C_2$ into a bigger community, $m_{12}$ should
satisfy:

$$\frac{m_{C_1}}{d_{C_1}^+} + \frac{m_{C_2}}{d_{C_2}^+} < \frac{m_{C_1} + m_{C_2} + m_{12}}{d_{C_1}^+ + d_{C_2}^+}.$$

The above condition is equivalent to:

$$m_{C_1} \frac{d_{C_2}^+}{d_{C_1}^+} + m_{C_2} \frac{d_{C_1}^+}{d_{C_2}^+} < m_{12}$$

which in turn implies $2\sqrt{m_{C_1} m_{C_2}} < m_{12}$ by the AM-GM inequality. Without loss of
generality, we can assume that $m_{C_1} \le m_{C_2}$, thus $2 m_{C_1} < m_{12}$. In other words, $C_1$ would
have more than twice as many edges toward $C_2$ as internal edges, which violates the condition
of even a weak community. Moreover, this inequality implies that the condition for merging two
adjacent communities depends only on the local structure of the two communities, regardless of
the rest of the network. This observation indicates that our proposed metric $\mathcal{R}$ is strongly
resistant to the resolution limit.
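
The locality of this criterion can be checked numerically. The sketch below evaluates the merge condition in both its original and rearranged forms using exact fractions; the function names and the sample numbers are illustrative assumptions, with $d^+$ taken to be the total degree of a community in the undirected setting.

```python
from fractions import Fraction


def merge_improves(m1, d1, m2, d2, m12):
    """Original condition: m1/d1 + m2/d2 < (m1 + m2 + m12) / (d1 + d2)."""
    lhs = Fraction(m1, d1) + Fraction(m2, d2)
    rhs = Fraction(m1 + m2 + m12, d1 + d2)
    return lhs < rhs


def merge_improves_rearranged(m1, d1, m2, d2, m12):
    """Equivalent form: m1 * d2/d1 + m2 * d1/d2 < m12."""
    return m1 * Fraction(d2, d1) + m2 * Fraction(d1, d2) < m12


# Two 5-cliques (10 internal edges each) joined by a single edge:
# each community has total degree 2 * 10 + 1 = 21.
cliques = (10, 21, 10, 21, 1)

# Two sparse groups with 1 internal edge each but 10 edges between them:
# each has total degree 2 * 1 + 10 = 12.
entangled = (1, 12, 1, 12, 10)

keep_cliques_apart = not merge_improves(*cliques)
merge_entangled = merge_improves(*entangled)
```

Neither function takes the size of the whole network as an argument, which is the point of the analysis: the decision depends only on quantities local to the two communities.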

7.3.3.3  Connection  to  stability  estimation 

We next verify the following properties of network communities identified by
optimizing our suggested metric $\mathcal{R}$: (1) links within a community are of high stability
and (2) links connecting communities are of low stability. These two observations
are shown in Proposition 7.4.

Proposition 7.4. Let $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ be a community structure detected by
optimizing $\mathcal{R}$. Then links within each $C_i$ are of strong stability and those connecting
communities are of weak stability.


Proof. For any node $p \in V$ and subset $A \subseteq V$, let $w_{p,A}$ be the total weight of all links that
p has towards A and vice versa. By this definition, we obtain $w_p = w_{p,A} + w_{p,V \setminus A}$. For any
community $C \in \mathcal{C}$, $s \in C$ and $p \notin C$, since p is not a member of C, we have

$$\frac{w_C}{w_C^+} \ge \frac{w_C + w_{p,C}}{w_C^+ + w_p} = \frac{w_C + w_{p,C}}{w_C^+ + w_{p,C} + w_{p,V \setminus C}},$$

because otherwise joining p to C would give a better value of $\mathcal{R}$. This inequality is
equivalent to

$$\frac{w_{p,C}}{w_p} \le \frac{w_C}{w_C^+},$$

which in turn implies that the stability contribution of links joining p to C is insignificant
in comparison to C as a whole.

Similarly, for any node $s \in C$, we have

$$\frac{w_C}{w_C^+} \ge \frac{w_C - w_{s,C}}{w_C^+ - w_s} = \frac{w_C - w_{s,C}}{w_C^+ - w_{s,C} - w_{s,V \setminus C}},$$

because otherwise excluding s from C would give a better $\mathcal{R}$. This inequality is
equivalent to

$$\frac{w_{s,C}}{w_s} \ge \frac{w_C}{w_C^+},$$

which in turn implies that the stability contribution of internal links of C is significant in
comparison to C as a whole. □

7.3.3.4 A greedy algorithm for the SCD problem

Analyzing the theoretical hardness of the SCD problem is beyond the
scope of this paper. In fact, the NP-hardness of the SCD problem can be shown by a
reduction similar to that for MODULARITY in [8] (see also [101] and [36] for a comprehensive
survey on similar graph clustering problems). Given its NP-hardness, a heuristic
approach that can provide a good solution in a timely manner is more
desirable. In this section, we describe a greedy algorithm for the SCD problem
consisting of community growing, strengthening and refinement phases, described
as follows.

Growing phase. This phase is responsible for discovering raw communities in the
input network. Initially, all nodes are unassigned and do not belong to any community.
Next, a random node is selected as the first member (or the seed) of a new community
C, and new members that help to maximize C’s persistence probability
are then gradually admitted into C. When no remaining node can improve this
objective for the current community, another new community is formed and the whole
process is repeated in the very same manner on this newly formed community.

Strengthening phase. We further rearrange nodes into more appropriate communities.
Since new members are admitted into a community C in a random order, C’s objective
value could be further improved by the absence of some of its members, as they can be
obstacles for the total stability. This requires the reevaluation of all of C’s members.
Therefore, in this phase, we exclude any node which reduces the persistence
probability of a community and let it become a singleton community. The removal of
such nodes creates more cohesive communities, i.e., communities with higher internal
stability.

Refining phase. In the last phase, the global stability of the whole network
is reevaluated. In particular, this last refinement phase looks at the merging of
two adjacent communities in order to improve the overall objective function. If two
communities have a great number of mutual connections between them, it is more
stable to merge them into one community. The final algorithm, which we call the SCD
algorithm, is presented in Alg. 17.

Algorithm 17 SCD Algorithm
Input: A directed weighted graph G = (V, E, w)
Output: Community structure $\mathcal{C}$

Growing Phase:
    $\mathcal{C} \leftarrow \emptyset$
    $A \leftarrow V$
    while $\exists$ unassigned node $u \in A$ do
        $C \leftarrow \{u\}$
        $A \leftarrow A \setminus \{u\}$
        while $\exists v \in A$ such that $u_{C \cup \{v\}} > u_C$ do
            $v \leftarrow \arg\max_{v \in A} \{u_{C \cup \{v\}}\}$
            $C \leftarrow C \cup \{v\}$
            $A \leftarrow A \setminus \{v\}$
        end while
        $\mathcal{C} \leftarrow \mathcal{C} \cup \{C\}$
    end while

Strengthening Phase:
    for $C \in \mathcal{C}$ do
        while $\exists u \in C$ such that $u_C < u_{C \setminus \{u\}}$ do
            $C \leftarrow C \setminus \{u\}$
            $\mathcal{C} \leftarrow \mathcal{C} \cup \{\{u\}\}$
        end while
    end for

Refining Phase:
    while $\exists C_1, C_2$ such that $u_{C_1 \cup C_2} > u_{C_1} + u_{C_2}$ do
        $(C_1, C_2) \leftarrow \arg\max_{C_1, C_2 \in \mathcal{C}} \{u_{C_1 \cup C_2} - u_{C_1} - u_{C_2}\}$
        $\mathcal{C} \leftarrow (\mathcal{C} \setminus \{C_1, C_2\}) \cup \{C_1 \cup C_2\}$
    end while

Return $\mathcal{C}$
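
The three phases can be sketched in Python as follows. This is only an illustrative reading of Alg. 17 under several assumptions: the local ratio $w_C / w_C^+$ from Proposition 7.3 stands in for $u_C$, seeds are taken in sorted order rather than at random so that the run is reproducible, and the strengthening phase ejects the member whose removal helps most. The function names and the graph layout (`w[a][b]` is the weight of edge a to b) are hypothetical.

```python
from itertools import combinations


def score(w, C):
    """Local ratio w_C / w_C^+ used here as a stand-in for u_C (an assumption)."""
    w_in = sum(wt for a in C for b, wt in w.get(a, {}).items() if b in C)
    w_out = sum(wt for a in C for wt in w.get(a, {}).values())
    return w_in / w_out if w_out > 0 else 0.0


def scd_greedy(w, nodes):
    # Growing phase: seed a community, then admit the best node while the score improves.
    unassigned = sorted(nodes)  # deterministic seeds instead of random ones
    communities = []
    while unassigned:
        C = {unassigned.pop(0)}
        while unassigned:
            best = max(unassigned, key=lambda v: score(w, C | {v}))
            if score(w, C | {best}) <= score(w, C):
                break
            C.add(best)
            unassigned.remove(best)
        communities.append(C)

    # Strengthening phase: eject members whose removal raises the community score.
    for C in list(communities):
        while len(C) > 1:
            worst = max(C, key=lambda u: score(w, C - {u}))
            if score(w, C - {worst}) <= score(w, C):
                break
            C.remove(worst)
            communities.append({worst})

    # Refining phase: merge the pair with the largest positive gain, if any.
    while len(communities) > 1:
        gain, i, j = max(
            (score(w, communities[i] | communities[j])
             - score(w, communities[i]) - score(w, communities[j]), i, j)
            for i, j in combinations(range(len(communities)), 2)
        )
        if gain <= 0:
            break
        merged = communities[i] | communities[j]
        communities = [c for k, c in enumerate(communities) if k not in (i, j)]
        communities.append(merged)
    return communities


# Two bidirectional triangles joined by the bidirectional bridge 2 <-> 3.
w = {
    0: {1: 1.0, 2: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {0: 1.0, 1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0, 5: 1.0}, 4: {3: 1.0, 5: 1.0}, 5: {3: 1.0, 4: 1.0},
}
result = scd_greedy(w, range(6))
```

Every call to `score` rescans the community, which keeps the sketch short; an efficient implementation would update the internal and outgoing weights incrementally as nodes join or leave.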

7.4  Experimental  Results 

In this section, we present our results on the discovery of network communities
on both synthesized networks with known ground truths and real-world social traces,
including the NetHEPT and NetHEPT.WC collaboration networks and a Facebook network. We
evaluate the following aspects of our proposed SCD framework: (1) the effectiveness
of our link stability estimation process, (2) the ability to identify the general network
community structure without the concept of community stability, i.e., how similar our
detected communities are to the ground truths, and (3) the ability to
identify stable communities in reference to the consensus of other state-of-the-art
methods, including Blondel’s [6], Infomap [93] and OSLOM [61] methods, after their
multiple executions.

7.4.1  Datasets 

(Synthesized networks) Of course, the best way to evaluate our approaches is to
validate them on real-world networks with known community structures. Unfortunately,
we often do not know those structures beforehand, or such structures cannot be easily
mined from the network topologies. Although synthesized networks might not reflect
all the statistical properties of real ones, they provide known ground truths
via planted communities and the ability to vary other network parameters such as
sizes, densities and overlapping levels. Testing community detection methods on
generated data has also become a usual practice that is widely accepted in the field
[55].

We use the well-known LFR benchmark [55] to generate 190 weighted and directed
testbeds. Generated data follow a power-law degree distribution and contain embedded
communities of varying sizes that capture characteristics of real-world networks.
Parameters are: the number of nodes N = 1000 and 5000; the mixing parameter
$\mu = [0.1 \ldots 1]$ controlling the overall sharpness of the community structure; and the
minimum (minC) and maximum (maxC) community sizes, set to (25, 50) for small-size and
(50, 100) for big-size communities as in the standard settings. Each test is averaged
over 100 runs for consistency.

Figure 7-2. Results on synthesized networks with different community criteria.
A) Networks with minC, maxC unconstrained. B) Networks with minC = 25, maxC = 50
(small-size). C) Networks with minC = 50, maxC = 100 (big-size).

(NetHEPT and NetHEPT.WC) The NetHEPT traces are widely used datasets for
testing social-aware detection methods [11][12]. These traces contain information,
mostly the academic collaboration, from arXiv’s “High Energy Physics - Theory” section,
where nodes stand for authors and links represent coauthorships. As delivered,
the NetHEPT networks contain 15233 nodes and 31398 links, and weights on edges are
assigned either uniformly at random (for NetHEPT data) or by weighted cascade (for
NetHEPT WC data), where $w_{uv} = 1/d_{in}(v)$ with $d_{in}(v)$ being the indegree of node v.
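
The weighted cascade assignment is simple to reproduce. The following sketch, which assumes a plain list of distinct directed edges and uses hypothetical names, sets each edge weight to the reciprocal of the target's indegree so that the incoming weights of every node sum to one.

```python
from collections import Counter


def weighted_cascade(edges):
    """Return {(u, v): 1 / d_in(v)} for a list of distinct directed edges (u, v)."""
    indegree = Counter(v for _, v in edges)
    return {(u, v): 1.0 / indegree[v] for u, v in edges}


edges = [("a", "c"), ("b", "c"), ("c", "a"), ("d", "c"), ("d", "a")]
weights = weighted_cascade(edges)
incoming_to_c = sum(wt for (u, v), wt in weights.items() if v == "c")
```

Note that the resulting weights are asymmetric even when the underlying links are mutual, since they depend only on the indegree of the receiving node.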

(Facebook) This dataset contains friendship information among users of the New Orleans
regional network on Facebook, spanning from September 2006 to January 2009 [100].
The data contains more than 63K nodes (users) connected by more than 1.5 million
friendship links, with an average node degree of 23.5. In our experiments, the weight
of each link between users u and v is proportional to the communication frequency
between them, normalized over the whole network.

7.4.2  Metric 

To measure the quality of the detected communities in comparison with the
embedded ground truths, we evaluate the Generalized Normalized Mutual Information
(NMI) [55]. Basically, the NMI(U, V) value of two structures U and V is 1 if U and V are
identical and 0 if they are totally independent. This is the most important metric for
a community detection algorithm because it indicates how good the algorithm is in
comparison with the planted communities. Higher NMI values are expected for a better
community detection algorithm.
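
For intuition, the sketch below computes the standard NMI between two disjoint partitions given as label lists, normalizing the mutual information by the mean of the two entropies. Note that this is the plain variant rather than the generalized NMI of [55] used in the experiments, which also handles overlapping communities; the function name and sample labelings are illustrative.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information 2 I(A; B) / (H(A) + H(B)) of two labelings."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    entropy_a = -sum(c / n * log(c / n) for c in count_a.values())
    entropy_b = -sum(c / n * log(c / n) for c in count_b.values())
    mutual = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if entropy_a + entropy_b == 0:
        return 1.0  # both labelings put every node in a single community
    return 2 * mutual / (entropy_a + entropy_b)


truth = [0, 0, 1, 1]
same_up_to_renaming = [7, 7, 3, 3]   # identical grouping, different label names
unrelated = [0, 1, 0, 1]             # statistically independent of `truth`
```

The score depends only on how nodes are grouped, not on the label values themselves, so a detected partition never needs to be aligned with the ground-truth numbering before comparison.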

7.4.3  Effect  of  link  stability  estimation 

We first evaluate the effect of our link stability estimation on the detection of network
communities by comparing the NMI values of SCD and its version with No Link stability
Prediction (SCD-NLP). Due to space limits, results of SCD and SCD-NLP are
reported in Figure 7-2, together with those on general community structure detection.

Figure 7-3. Performance of SCD in detecting stable communities on real social traces.
(Each panel compares against Blondel, Infomap and Oslom; one panel is labeled NetHEPT-WC.)

In general, SCD-NLP performs very competitively even without the preprocessing step:
on synthesized networks with no community size constraint (Figure 7-2A), its discovered
communities are almost perfectly similar to the embedded ones (NMI values
approximately 1) on $\mu = [0 \ldots 0.5]$, whereas the quality drops quickly when $\mu$ is
above 0.5. We note that this drop in detection quality is debatable and does not
necessarily imply bad performance, since networks with $\mu > 0.5$ are considered very
stochastic and thus may not contain a clear community structure. Nevertheless, with
the help of the stability estimation, the performance of SCD is boosted significantly,
as the detection quality is very high even for $\mu > 0.65$ (N = 1000) and $\mu > 0.75$
(N = 5000), and only drops when the networks are extremely stochastic ($\mu > 0.8$).

We next take a look at the cases where networks are constrained with small-sized
(Figure 7-2B) and big-sized communities (Figure 7-2C). We observe that, when
community sizes are constrained, SCD-NLP performs much better than before and
even overcomes its prior limit $\mu = 0.5$. In particular, the performance of SCD-NLP closely
approaches that of SCD, especially in large networks (N = 5000). However, SCD-NLP
appears to be sensitive to big-size communities in small networks, as its quality drops
quickly in Figure 7-2C (left), and seems to favor small-size communities, as its
plots tend to tangle with those of SCD (Figure 7-2B). The detection quality of SCD, thanks to
the stability estimation, stays high in all test cases.

In summary, these results indicate that (1) even without the stability estimation process,
our suggested metric $\mathcal{R}$ appears to be a very good one for detecting community structure
in general directed and weighted networks, and (2) when the community size is
constrained, link stability estimation has little effect on the community detection
quality. However, in real-world social network settings where community sizes are
typically unknown, and therefore unconstrained, the stability estimation has a significant
effect on the detection of network communities. These experiments also confirm the
efficacy of our proposed stability estimation procedure.

7.4.4  General  community  structure  detection 

We next investigate SCD’s ability to identify the general network community
structure, i.e., without community stability, in comparison with the aforementioned
state-of-the-art detection methods. Results are reported in Figure 7-2.

In general, the performance of our SCD framework on synthesized networks
appears to be better than that of the Blondel and Infomap methods, and only lags
behind Oslom’s when the networks are heavily stochastic. When the community size
is unconstrained, the detection quality of SCD and the other methods, except for Blondel’s,
remains nearly perfect on $\mu = [0 \ldots 0.65]$ (N = 1000) and $\mu = [0 \ldots 0.8]$ (N = 5000) and
then degrades quickly. Among the three methods, Infomap’s performance appears to
be sensitive to a certain mixing threshold $\mu$, as its NMI values tend to drop directly
to 0, whereas those of Oslom and ours tend to drop more slowly. On average, the NMI values
of SCD are about 8% and 3% better than those of the Blondel and Infomap methods, and
lag about 2% behind those of the Oslom method. Blondel’s method, on the other hand,
does not attain good performance, with low NMI values even at a low range
of the mixing value $\mu$. A possible explanation for this behavior of Blondel’s method is
the effect of the resolution limit, as we shall discuss below.

When the embedded communities are constrained to small and large community
sizes, we observe nearly the same behavior of the SCD, Oslom and Infomap methods, as
depicted in Figures 7-2B and 7-2C. Blondel's method improves significantly in
these cases, where its performance is close to that of the others. As discussed
above, one possible reason for the poor behavior of Blondel's method is
the resolution limit of the modularity objective function [35]. When the community size is
unconstrained, this resolution limit can mislead Blondel's method into merging
communities that are small in comparison to the rest of the network, which
results in low NMI values. On the other hand, this resolution limit does not take effect
when size constraints are imposed, hence the significant improvement. Our SCD
framework, as shown in Section 7.3.3.2, can withstand this scaling limit, as it obtains
highly competitive results. Moreover, the difference between SCD and the other
methods is insignificant on average, which indicates that all methods are able to detect
network communities with high quality. This is not a surprising result, since Blondel,
Oslom and Infomap are currently state-of-the-art methods, but it is an encouraging
one for our SCD framework.

7.4.5  Results  on  stable  community  detection 

In order to compare our results to the consensus of other detection methods, we
adopt a strategy recently proposed in [59]. In particular, given a specific community
detection method A, its consensus (or stable) communities can be determined as follows:
(i) execute A on G $n_p$ times to obtain $n_p$ partitions; (ii) find the matrix
$D = (D_{ij})$, where $D_{ij}$ is the probability that vertices $i$ and $j$ of G are
assigned to the same cluster among the $n_p$ partitions; (iii) disregard all entries
$D_{ij}$ that are below a threshold $\tau$; (iv) apply A on D $n_p$ times to create
$n_p$ partitions; and (v) if all partitions are equal, stop (the resulting matrix would
be block diagonal); otherwise, go to step (ii). As suggested in [59], the resulting
communities are ideal candidates for stable structures, as members commit to their
communities. We also compute the Jaccard index $J(U, V) = |U \cap V| / |U \cup V|$ to
better evaluate the quality of the detected stable communities. Results are presented
in Figure 7-3.

As illustrated by the subfigures, even a single run of SCD is able to obtain very high
NMI scores and Jaccard indices in comparison with the consensus of other methods
after multiple runs. In particular, community structures discovered by SCD on NetHEPT
and NetHEPT_WC obtain nearly 70% similarity in comparison with the Blondel, Infomap and
Oslom methods, while the Jaccard indices indicate that, on average, almost 66%
of the nodes are found in common between SCD and the core structure of the other
competitors. This shows that communities discovered by SCD indeed overlap highly
with the core community structures identified by other detection methods, which in turn
implies that the clusters found by SCD are stable with high confidence. Surprisingly, in
both the NetHEPT and NetHEPT_WC networks, we observe high similarity among the
consensus of the Blondel, Infomap and Oslom methods even with differences in edge weight
distribution. This observation indicates that the communities identified by SCD are, in fact,
stable in these networks.
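
The co-assignment matrix of step (ii), the thresholding of step (iii), and the Jaccard index used above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names `coassignment_matrix` and `jaccard`, the list-of-sets representation of a partition, and the convention that two empty sets have Jaccard index 1 are all assumptions made for this example.

```python
from itertools import combinations


def coassignment_matrix(partitions, nodes, tau):
    """Return D where D[a][b] is the fraction of partitions that place
    nodes[a] and nodes[b] in the same cluster; entries below tau become 0."""
    index = {v: a for a, v in enumerate(nodes)}
    n = len(nodes)
    D = [[0.0] * n for _ in range(n)]
    for partition in partitions:          # one partition = list of sets
        for cluster in partition:
            for u, v in combinations(cluster, 2):
                D[index[u]][index[v]] += 1.0
                D[index[v]][index[u]] += 1.0
    n_p = len(partitions)
    for a in range(n):
        for b in range(n):
            D[a][b] /= n_p
            if D[a][b] < tau:             # step (iii): drop weak entries
                D[a][b] = 0.0
    return D


def jaccard(U, V):
    """Jaccard index |U & V| / |U | V| (assumed to be 1.0 for two empty sets)."""
    U, V = set(U), set(V)
    union = U | V
    return len(U & V) / len(union) if union else 1.0


if __name__ == "__main__":
    runs = [[{1, 2}, {3, 4}], [{1, 2, 3}, {4}]]
    print(coassignment_matrix(runs, [1, 2, 3, 4], tau=0.6))
    print(jaccard({1, 2, 3}, {2, 3, 4}))
```

In the full procedure, the thresholded matrix would be fed back to the detection method as a weighted graph until all runs agree; that loop is omitted here because it depends on the chosen method.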

Even in Facebook, a large network with real social interactions, the consensus
communities discovered by other methods and by SCD remain highly similar, with nearly
60% and 50% similarity to those found by the Blondel and Infomap methods, respectively,
and over 50% overlap in the stable partitions as indicated by the Jaccard
indices. The achieved NMI values in comparison with the Oslom method are relatively low,
as their core communities do not appear to overlap highly (Jaccard index of only 35%).
We note that this low similarity does not indicate instability of the community structure
found by our SCD framework, since communities detected by Oslom can overlap with each
other, while SCD works towards a disjoint community structure. Nevertheless, with just a
single run, the above competitive results in reference to other state-of-the-art methods
confirm the efficacy and quality of our method in detecting stable network communities in
OSNs.

7.5  Conclusion


In  this  work,  we  investigate  community  structures  in  directed  OSNs  with  more  focus 
on  community  stability.  As  an  effort  towards  the  understanding  of  stable  communities, 
we  suggest  an  estimation  procedure  which  provides  helpful  insights  into  the  stability 
of  links  in  the  input  network.  Based  on  that,  we  propose  SCD,  a  framework  to  identify 
community  structure  in  directed  OSNs  with  the  advantage  of  community  stability.  We 
explore  an  essential  connection  between  the  persistence  probability  of  a  community 
at  the  stationary  distribution  and  its  local  topology,  which  is  the  fundamental  point 
to  back  our  SCD  framework.  Finally,  we  certify  the  efficiency  of  our  approach  on 
both  synthesized  datasets  with  embedded  communities  and  real-world  social  traces, 
including  NetHEPT  collaboration  and  Facebook  social  networks,  in  reference  to  the 
consensus  of  other  state-of-the-art  detection  methods.  Highly  competitive  empirical 
results  confirm  the  quality  and  efficiency  of  SCD  on  identifying  stable  communities  in 
OSNs.

CHAPTER  8

ASSESSING  NETWORK  COMMUNITY  STRUCTURE  VULNERABILITY 

8.1  Introduction 

As a first study on assessing the vulnerability of network community structure,
in this paper we take the first step toward understanding how the failures of crucial nodes
in the network affect its community structure. In particular, we are interested in
identifying network nodes whose removal triggers a significant restructuring of the current
community structure. Formally, given the input network and a positive number k, we
introduce the Community Structure Vulnerability (CSV) problem, which aims to find a set
S of k nodes whose removal maximally transforms the current network community
structure into a totally different one, i.e., the new community structure resulting from the
removal of S is of least similarity to the original one, as evaluated via the Normalized
Mutual Information [20] measure.

Knowledge about this crucial vulnerability of network community structure is of
considerable use, especially for social-aware methods in mobile ad-hoc and online
social networks (OSNs). To give a sense of its effects, consider message forwarding
in DTNs. Since social-based forwarding strategies in DTNs rely on the highest ranked
nodes in each community to forward the message [47][80], knowledge of this
vulnerability can help either to design routing algorithms that do not overload those
crucial devices, if they are the highly ranked ones in a community, or to design
an effective backup plan for when some of them fail at the same time. In worm
containment applications in OSNs [82][110], this knowledge can provide helpful insights
into the protection of those sensitive nodes, if they are indeed highly influential users,
once worms spread out in the network. As a result, the identification of nodes whose
removal triggers a massive restructuring of the community structure is extremely important
for the network's regular operation. However, under a minor structural change in which a
node is excluded from a community, this particular community can either stay intact, if the
removed node is less important, or be broken down into smaller subcommunities,
which can further be merged into other communities, if the removed node is of great
importance to the community. This unpredictable transformation of network communities,
together with their large scale in reality, makes the assessment of community structure
vulnerability a fundamental yet challenging problem.

8.2  Problem  Definition 

In this section, we first define the graph notations that will be used throughout
this  paper.  We  then  describe  Normalized  Mutual  Information  (NMI)  [20],  a  concept 
in  Information  Theory,  as  a  metric  to  assess  the  difference  between  community 
structures  before  and  after  the  removal  of  important  nodes.  Finally,  we  formally  define 
the  Community  Structure  Vulnerability  problem  -  our  main  focus  in  this  paper. 

(Notations) Let $G = (V, E)$ be an undirected unweighted graph representing a
network, where $V$ is the set of $|V| = N$ nodes (e.g., users) and $E$ is the set of
$|E| = M$ links. For any node $u \in V$ and a set $C \subseteq V$, let $N(u)$, $d_u$ and
$d_u^C$ be the set of all neighbors of $u$, its degree in $G$ and its degree in $C$,
respectively. Furthermore, let $n_C = |C|$ be the number of nodes and $m_C$ be the
number of internal edges in $C$.

(Community structure) Denote by $A$ the specific community detection algorithm that
will be applied on $G$, and by $X = \{X_1, X_2, \ldots, X_{c_X}\}$ and
$Y = \{Y_1, Y_2, \ldots, Y_{c_Y}\}$ the two (possibly overlapping) community structures of
$c_X$ and $c_Y$ communities detected by $A$ before and after the removal of a set $S$ of
$k$ nodes in $G$, respectively. Mathematically, $X$ and $Y$ are represented as $X = A(G)$
and $Y = A(G[V \setminus S])$, where $G[V \setminus S]$ is the subgraph induced by
$V \setminus S$ on $G$. For any index $i = 1, \ldots, c_X$ and $j = 1, \ldots, c_Y$, let
$x_i = |X_i|$, $y_j = |Y_j|$, and $n_{ij} = |X_i \cap Y_j|$. Finally, let
$x = \sum_{i=1}^{c_X} x_i$, $y = \sum_{j=1}^{c_Y} y_j$ and $n = \sum_{i,j} n_{ij}$ be the
total size of communities in $X$ and $Y$, and the total number of common nodes shared
between $X$ and $Y$, respectively.

(Normalized Mutual Information) In order to evaluate how much the network
community structure changes before and after the removal of important nodes, we
utilize the concept of Normalized Mutual Information suggested in [20]. Basically, given
two structures $X$ and $Y$, $\mathrm{NMI}(X, Y)$ is 1 if $X$ and $Y$ are identical and is 0
if they are totally separated; the higher the NMI score, the more similar $X$ and $Y$ are.
As a result, NMI is a well-suited metric for certifying the quality of
community structures discovered by different detection algorithms. The effectiveness of
this widely accepted measure has also been extensively verified in the literature [55].
Formally, $\mathrm{NMI}(X, Y)$ is defined as

\[ \mathrm{NMI}(X, Y) = \frac{2 I(X, Y)}{H(X) + H(Y)}, \]

where $H(X)$, $H(Y)$ and $I(X, Y)$ are the entropies of structures $X$ and $Y$ and the Mutual
Information conveyed between them, respectively. More details about the NMI formulation
are elaborated in our analysis.

(Problem definition) Finally, the Community Structure Vulnerability (CSV) problem is
formulated as follows.

Definition 1. Given a network represented by an undirected and unweighted graph $G$,
a specific community detection algorithm $A$, and a positive integer $k < N$, we seek a
subset $S \subseteq V$ such that

\[ S = \underset{T \subseteq V,\ |T| = k}{\arg\min} \; \{ \mathrm{NMI}(A(G), A(G[V \setminus T])) \}. \]

In other words, the CSV problem seeks a subset $S \subseteq V$ of $k$ nodes whose removal
results in the maximum difference between the initial community structure $X$ and the
new community structure $Y$ detected by $A$ on $G[V \setminus S]$. We call $S$ the
Node-Vulnerability set of $G$ since its removal maximally transforms the network
communities of $G$ into different structures.
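
To make Definition 1 concrete, the following sketch enumerates all k-subsets of a very small graph and keeps the one with the lowest NMI. It is only an illustration under several assumptions: the names `nmi`, `components` and `csv_brute_force` are hypothetical, connected components stand in for the detection algorithm A (any function with the same signature could be passed instead), and the NMI follows the chapter's convention $P(X_i) = x_i / N$, $P(Y_j) = y_j / (N - k)$. Exhaustive search is exponential in k and is not the method proposed in this chapter.

```python
from itertools import combinations
from math import log


def nmi(X, Y, N):
    """NMI between structure X (on N nodes) and structure Y (on the survivors),
    using P(X_i) = x_i / N and P(Y_j) = y_j / (N - k) as in the text."""
    M = len(set().union(*Y))                  # M = N - k surviving nodes
    if M == 0:
        return 0.0
    HX = -sum(len(c) / N * log(len(c) / N) for c in X)
    HY = -sum(len(c) / M * log(len(c) / M) for c in Y)
    I = 0.0
    for Xi in X:
        for Yj in Y:
            n_ij = len(Xi & Yj)
            if n_ij:
                I += n_ij / M * log(N * n_ij / (len(Xi) * len(Yj)))
    # Assumed convention: two trivial one-community structures count as identical.
    return 2 * I / (HX + HY) if HX + HY > 0 else 1.0


def components(nodes, edges):
    """Stand-in detection algorithm (an assumption): connected components."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        if u in adj and v in adj:             # ignore edges touching removed nodes
            adj[u].add(v)
            adj[v].add(u)
    seen, communities = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        communities.append(comp)
    return communities


def csv_brute_force(nodes, edges, k, detect=components):
    """Return (S, score): the k-subset whose removal minimizes NMI."""
    X = detect(nodes, edges)
    best_set, best_score = None, float("inf")
    for T in combinations(nodes, k):
        rest = [v for v in nodes if v not in T]
        score = nmi(X, detect(rest, edges), len(nodes))
        if score < best_score:
            best_set, best_score = set(T), score
    return best_set, best_score


if __name__ == "__main__":
    path_and_edge = [(1, 2), (2, 3), (3, 4), (4, 5), (6, 7)]
    print(csv_brute_force([1, 2, 3, 4, 5, 6, 7], path_and_edge, k=1))
```

On this toy input the search selects the middle node of the five-node path, since removing it splits the larger community into two equal halves.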

Remark. The formulation of CSV requires the community detection algorithm A
as an input parameter. Because there is not yet a universally accepted
definition of a network community, this input is necessary in the sense that different
algorithms with different objective functions might favor different sets of nodes, and
thus a good solution set for one community detection algorithm may not be good for
the others. However, when there is a clear objective function for finding community
structure, such as maximizing Modularity Q [55] or the total internal density [80], this
requirement can be lifted. In any case, a node selection strategy that relies more on
the input network and less on the community detection algorithm is always desirable.

8.3  Analysis  of  NMI  Measure 

In this section, we investigate the possible conditions on the sizes and the number
of communities that can potentially lead to either the global or local minimization of
NMI(X, Y). We stress that these conditions are by no means universal or exhaustive,
since some of them might not hold simultaneously, given the input parameters.
Rather, what we hope is that these conditions provide key insights into the
selection of important nodes to maximally separate X and Y. In the coming paragraphs,
we first discuss the NMI formulation in greater detail, and then analyze it in terms of
both disjoint and overlapping community structures.

8.3.1  NMI  formulation 

To evaluate $\mathrm{NMI}(X, Y)$ [20], where $X = \{X_1, X_2, \ldots, X_{c_X}\}$ and
$Y = \{Y_1, Y_2, \ldots, Y_{c_Y}\}$, we start out by considering community assignments
$X_t$ and $Y_t$, where $X_t$ and $Y_t$ indicate the community labels of a node $t$ in
$X$ and $Y$, respectively. Without loss of generality, we can also assume that the labels
$X_t$ and $Y_t$ are values of two random variables $X$ and $Y$ (here we reuse the
notations $X$ and $Y$ to denote the two random variables), with joint distribution

\[ P(X_i, Y_j) = P(X = X_i, Y = Y_j) = \frac{n_{ij}}{N - k}, \]

and individual distributions

\[ P(X_i) = P(X = X_i) = \frac{x_i}{N}, \qquad P(Y_j) = P(Y = Y_j) = \frac{y_j}{N - k}. \]

The entropy (or uncertainty) of $X$ and $Y$ is defined as [18]

\[ H(X) = -\sum_{i=1}^{c_X} P(X_i) \log P(X_i) = -\sum_{i=1}^{c_X} \frac{x_i}{N} \log \frac{x_i}{N}, \]

\[ H(Y) = -\sum_{j=1}^{c_Y} P(Y_j) \log P(Y_j) = \frac{1}{N - k} \Big( y \log (N - k) - \sum_{j=1}^{c_Y} y_j \log y_j \Big). \]

Note that in the CSV problem, $X$ can be derived straightforwardly from $A$ and $G$, and
thus the quantities $x_i$ can also be inferred from these input parameters. Therefore, we
simply consider the $x_i$'s and $H(X)$ as constants in this paper.

The Mutual Information $I(X, Y)$ [18] of two random variables $X$ and $Y$ is defined as

\[ I(X, Y) = \sum_{i=1}^{c_X} \sum_{j=1}^{c_Y} P(X_i, Y_j) \log \frac{P(X_i, Y_j)}{P(X_i) P(Y_j)}. \]


This  measure  is  symmetric  and  it  tells  us  how  much  we  know  about  variable  (or 
structure)  Y  if  we  already  know  about  variable  X,  and  vice  versa.  However,  as  indicated 
in  [20][55],  Mutual  Information  itself  is  not  ideal  as  a  global  similarity  metric  since 
any  subpartition  of  a  given  community  structure  X  would  result  in  the  same  mutual 
information  with  X,  even  though  they  can  possibly  be  very  different  from  each  other.  As 
a  result,  [20]  introduces  the  Normalized  Mutual  Information  which  can  overcome  that 
limitation.  Formally,  NMI  of  two  random  variables  X  and  Y  is  defined  as 


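\[ \mathrm{NMI}(X, Y) = \frac{2 I(X, Y)}{H(X) + H(Y)}. \tag{8-1} \]

In terms of the notations above, substituting these distributions into (8-1) and
multiplying the numerator and denominator by $(N - k)$, $\mathrm{NMI}(X, Y)$ can be written as

\[ \mathrm{NMI}(X, Y) = \frac{2 \sum_{i=1}^{c_X} \sum_{j=1}^{c_Y} n_{ij} \log \frac{N n_{ij}}{x_i y_j}}{(N - k) H(X) + y \log (N - k) - \sum_{j=1}^{c_Y} y_j \log y_j}. \tag{8-2} \]

8.3.2  Minimizing  NMI  in  a  disjoint  community  structure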

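When network communities are disjoint from each other, we have $X_i \cap X_s = \emptyset$,
$\bigcup_{i=1}^{c_X} X_i = V$, $Y_j \cap Y_t = \emptyset$, and $\bigcup_{j=1}^{c_Y} Y_j = V \setminus S$
for all distinct $i, s = 1, \ldots, c_X$ and all distinct $j, t = 1, \ldots, c_Y$.
As a result, the following equalities hold true: $x = \sum_{i=1}^{c_X} x_i = N$,
$y = \sum_{j=1}^{c_Y} y_j = N - k$ and $n = \sum_{i,j} n_{ij} = N - k$ (*).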

8.3.2.1  Minimizing  NMI  within  a  community 

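We first investigate the behavior of $\mathrm{NMI}(X, Y)$ in a special case where only one
specific community of $X$ is affected by the removal of the set $S$ of $k$ nodes while the other
communities stay intact. We can assume that $X_1$ is the targeted community, which is
further divided into $p$ smaller subcommunities of sizes $s_1, s_2, \ldots, s_p$ satisfying
$\sum_{j=1}^{p} s_j = x_1 - k$. In this case

\[ H(Y) = \sum_{j=1}^{p} \frac{s_j}{N - k} \log \frac{N - k}{s_j} + \sum_{i=2}^{c_X} \frac{x_i}{N - k} \log \frac{N - k}{x_i}
       = \frac{(x_1 - k) \log (N - k) + \sum_{i=2}^{c_X} x_i \log \frac{N - k}{x_i} - \sum_{j=1}^{p} s_j \log s_j}{N - k} \]

and

\[ I(X, Y) = \sum_{j=1}^{p} \frac{s_j}{N - k} \log \frac{N}{x_1} + \sum_{i=2}^{c_X} \frac{x_i}{N - k} \log \frac{N}{x_i}
          = \frac{x_1 - k}{N - k} \log \frac{N}{x_1} + \sum_{i=2}^{c_X} \frac{x_i}{N - k} \log \frac{N}{x_i}. \]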


Thus, since $I(X, Y)$ does not depend on the $s_j$'s, $\mathrm{NMI}(X, Y)$ is minimized when
$\sum_{j=1}^{p} s_j \log s_j$ is minimized. Since the function $s \log s$ is
strictly convex for any $s > 0$, we apply Jensen's inequality [18] to this summation and get

\[ \frac{1}{p} \sum_{j=1}^{p} s_j \log s_j \ge \frac{\sum_{j=1}^{p} s_j}{p} \log \frac{\sum_{j=1}^{p} s_j}{p} = \frac{x_1 - k}{p} \log \frac{x_1 - k}{p}, \]

with equality holding when all $s_j$'s are equal to each other. This inequality reveals
that, in order to further minimize the RHS quantity, one can try to break $X_1$
into as many smaller communities of relatively the same size as possible (i.e., to enlarge
$p$ as much as possible while ensuring the $s_j$'s are all equal). This intuition makes sense,
since a new structure of $X_1$ with all singleton communities will incur
$\sum_{j=1}^{p} s_j \log s_j = 0$ and hence will maximize $H(Y)$ and in turn minimize
$\mathrm{NMI}(X, Y)$. However, since the new structure of $X_1$ depends on the community
detection algorithm $A$, the all-singleton scenario might not always be achievable.
Furthermore, will this crucial observation hold true in general disjoint and overlapping
community structures? Our analysis in the coming subsections leans toward an
affirmative answer.
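
As a quick numerical check of the argument above, the sketch below compares $\sum_j s_j \log s_j$ for a few ways of splitting 12 surviving nodes of a community. The helper names `s_log_s` and `jensen_bound` and the example sizes are assumptions chosen for illustration; the bound is the one obtained by multiplying both sides of Jensen's inequality by $p$.

```python
from math import log


def s_log_s(sizes):
    """Sum of s * log(s) over the subcommunity sizes (0 for singletons)."""
    return sum(s * log(s) for s in sizes)


def jensen_bound(total, p):
    """Lower bound total * log(total / p) for p parts that sum to total."""
    return total * log(total / p)


if __name__ == "__main__":
    for sizes in ([2, 4, 6], [4, 4, 4], [2] * 6, [1] * 12):
        total, p = sum(sizes), len(sizes)
        print(sizes, round(s_log_s(sizes), 3), round(jensen_bound(total, p), 3))
```

Equal-sized splits meet the bound exactly, finer equal splits give a smaller sum, and the all-singleton split drives the sum to zero, which is the case that maximizes $H(Y)$.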

8.3.2.2  Minimizing  NMI  in  a  general  disjoint  community  structure

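In a general disjoint community structure, the equalities (*) help to simplify
$\mathrm{NMI}(X, Y)$ (eq. 8-2) to

\[ \mathrm{NMI}(X, Y) = \frac{2 \sum_{i=1}^{c_X} \sum_{j=1}^{c_Y} n_{ij} \log \frac{N n_{ij}}{x_i y_j}}{(N - k) H(X) + (N - k) \log (N - k) - \sum_{j=1}^{c_Y} y_j \log y_j}. \]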

In order to minimize the above ratio, one would seek the conditions under which the
numerator of $\mathrm{NMI}(X, Y)$ is minimized while its denominator is maximized. To
maximize the latter quantity, we need to minimize $\sum_{j=1}^{c_Y} y_j \log y_j$. Applying
Jensen's inequality to this sum gives

\[ \frac{1}{c_Y} \sum_{j=1}^{c_Y} y_j \log y_j \ge \frac{y}{c_Y} \log \frac{y}{c_Y} = \frac{N - k}{c_Y} \log \frac{N - k}{c_Y}, \]

and thus $\sum_{j=1}^{c_Y} y_j \log y_j$ attains its minimum at $(N - k) \log \frac{N - k}{c_Y}$,
with equality holding when all $y_j$'s are equal to each other. As $N$ and $k$ are input
parameters, $\log \frac{N - k}{c_Y}$ can be further minimized when $c_Y$ is as large as
possible, while requiring the $y_j$'s to be equal to each other. Mathematically, this is
achieved when $Y$ contains exactly $c_Y = N - k$ singleton communities. However, since our
problem depends on the detection algorithm, this inequality advises that the new community
structure $Y$ should contain as many communities of relatively the same size as possible.
We take this observation into account, as it will play a key role in our important-node
selection process. This observation also coincides with what was inferred in the prior
special case, and intuitively agrees with the concepts of Critical Node Detection (CND) [25]
and Balanced Graph Partitioning (BGP) [2], whose goals are to delete nodes and cut the
input graph into $p$ connected components of relatively the same size. However, CSV
fundamentally differs from these problems in the sense that connected components in BGP
and CND do not necessarily reflect network communities.

In order to minimize the numerator, we rewrite it (up to the constant factor
$1/(N - k)$) as

\[ I(X, Y) = \sum_{i,j} n_{ij} \log \frac{n_{ij}}{y_j} - \sum_{i,j} n_{ij} \log \frac{x_i}{N}. \]

Applying the Log Sum Theorem [18] to the first summand gives

\[ I(X, Y) \ge (N - k) \log \frac{1}{c_X} - \sum_{i=1}^{c_X} (x_i - l_i) \log \frac{x_i}{N}, \]

because $n = y = N - k$ and $\sum_{j=1}^{c_Y} n_{ij} = x_i - l_i$ for all $i = 1, \ldots, c_X$,
where $l_i$ is the number of deleted (or lost) nodes in community $X_i$, and the $l_i$'s
satisfy $\sum_{i=1}^{c_X} l_i = k$. The equality holds when $n_{ij} / y_j$ is a constant, say
$\gamma > 0$, for all $i = 1, \ldots, c_X$ and $j = 1, \ldots, c_Y$. If we
assume that this is the case, then $\sum_{j=1}^{c_Y} n_{ij} = \gamma \sum_{j=1}^{c_Y} y_j = \gamma (N - k)$,
which in turn implies $N - k = \sum_{i,j} n_{ij} = c_X \gamma (N - k)$. Hence, $\gamma = 1 / c_X$
and thus $l_i = x_i - (N - k) / c_X$. Therefore, to minimize the second summand, the equation
$l_i = x_i - (N - k) / c_X$ advises that we should put more focus on (i.e., remove more
nodes in) the large communities $X_i$ of $X$ in order to break them into smaller modules.
This breaking down of large communities partially supports the prior observation that the
communities of $Y$ should have relatively the same size. Note that in this analysis, we have
assumed that $n_{ij} / y_j$ is a constant for all pairs of $i$ and $j$. In practice, this
might not always be the case, since real communities can be distributed differently
depending on the underlying detection algorithm. Nevertheless, we find this observation
helpful, as it suggests a general guideline for selecting important nodes in the network.

8.3.3  Minimizing  NMI  in  an  overlapped  community  structure

The minimization of the $\mathrm{NMI}(X, Y)$ measure is much more complicated when network
communities can overlap with each other. In particular, the conditions
$\bigcup_{i=1}^{c_X} X_i = V$ and $\bigcup_{j=1}^{c_Y} Y_j = V \setminus S$ still hold in this
case; however, $X_i \cap X_s$ and $Y_j \cap Y_t$ might not be empty for some
$i, s = 1, \ldots, c_X$ and $j, t = 1, \ldots, c_Y$. These facts indicate that
$x = \sum_{i=1}^{c_X} x_i \ge N$, $y = \sum_{j=1}^{c_Y} y_j \ge N - k$ and
$n = \sum_{i,j} n_{ij} \ge N - k$.

Our analysis strategy in this case is similar to the prior one, as we again strive to
maximize the denominator while minimizing the numerator of $\mathrm{NMI}(X, Y)$ (eq. 8-2).
Because $n \ge N - k$, the minimization of the top term $I(X, Y)$ no longer depends only on
the $x_i$'s. One way to work around this issue is to investigate the relative correlation
between the total community size $y$ and the number of communities $c_Y$. Let
$\alpha_A = y / c_Y$ be the ratio between these two quantities, or in other words, the
average community size. The $Y$-dependent part of the denominator of $\mathrm{NMI}(X, Y)$ is
bounded as

\[ y \log (N - k) - \sum_{j=1}^{c_Y} y_j \log y_j \le y \Big( \log (N - k) - \log \frac{y}{c_Y} \Big) = y \log (N - k) - y \log \alpha_A, \]

with equality holding when all $y_j$'s are equal to each other. To further maximize this
denominator, we need $y$ to be as large as possible while keeping $\alpha_A$ as small as
possible, i.e., the new community structure $Y$ should contain more and more communities
so as to increase $c_Y$ as well as to lower $\alpha_A$.

Due to the dependence on the specific detection algorithm $A$, this optimization over
the correlation between $y$ and $c_Y$ might not be globally achieved. However, a coarse
analysis of $y$ and $c_Y$ can be conducted in the following sense: if we
assume that $y$ is within a constant factor of the total number of actual nodes $(N - k)$,
i.e., $y \le \alpha_0 (N - k)$ for some constant $\alpha_0 \ge 1$, we can then increase the
value of the RHS by breaking up as many communities as possible while keeping them the
same size (i.e., enlarging $c_Y$ and keeping the $y_j$'s all equal), which helps to reduce
the impact of the $\log \alpha_A$ term. This observation, though relative, agrees with what
we obtained in the case of disjoint community structure. In the unfortunate case where $y$
is not known to be within any constant factor of $(N - k)$, the observation might not hold,
since both $y$ and $c_Y$ can be arbitrarily large and thus the $\log \alpha_A$ term could
still be relatively small.

Next, applying the Log Sum Theorem to the numerator (again written up to the constant
factor $1/(N - k)$) yields

\[ I(X, Y) = \sum_{i,j} n_{ij} \log \frac{N n_{ij}}{x_i y_j} \ge n \log \frac{N n}{x y}, \]

with equality holding when $\frac{n_{ij}}{x_i y_j}$ is a constant for all $i = 1, \ldots, c_X$
and $j = 1, \ldots, c_Y$. Thus, one can try to minimize $I(X, Y)$ by deleting nodes in such
a way that $n$ is maximized and $y$ is minimized while making sure that
$\frac{n_{ij}}{x_i y_j}$ is a constant. As a result, this minimization of $I(X, Y)$ is a
multi-objective optimization problem which may not have a feasible solution. However, if
we assume that the latter condition is imposed, i.e., $\frac{n_{ij}}{x_i y_j} = \beta_A$ for
some constant $\beta_A > 0$, then $n_{ij} = \beta_A x_i y_j$, and thus $n = \beta_A x y$.
This reduces the above inequality to

\[ I(X, Y) \ge \beta_A \, x \, y \log (\beta_A N). \]

The RHS of the inequality advises that, in order to minimize $I(X, Y)$, the total size
of network communities should not be too large, while the overlapping ratios of all
communities should be equal to each other and as small as possible. This is a different
criterion from the disjoint community structure point of view.

8.4  A  Solution  to  CSV  Problem 

In the following paragraphs, we consider the scenario in which maximizing the
internal density [80] is the objective function for finding network communities, i.e.,
communities of G are assumed to have optimized internal densities. In this setting,
we present genEdeg, an algorithm for solving the CSV problem that is independent of the
underlying community detection algorithm A. Our solution strategy tries to break
larger communities into as many small ones as possible while aiming for them to have
relatively the same size with small overlapping ratios. The idea of our strategy is based
on the following intuition: since communities in X are optimized for their internal density,
they are likely to contain strong, tightly connected substructures which form the
cores of these communities. As a result, the removal of crucial nodes in a core might
potentially break the community into smaller modules. Moreover, as nodes in a core are
tightly connected, there should be some edge that generates them, i.e., all nodes in the
core are adjacent to both endpoints of this edge. Inspired by this intuition, our strategy
works towards the identification of these generating edges of a community, and then
seeks the minimum set of generating edges that composes the original community.

Let $D$ be a subset of $V$. Denote by $\Psi(D) = \frac{2 m_D}{n_D (n_D - 1)}$ the internal
density of $D$ and by

\[ \tau(D) = \left( \frac{n_D (n_D - 1)}{2} \right)^{-\frac{2}{n_D (n_D - 1)}} \]

the threshold function on the internal density of $D$, respectively.
For any nodes $u, v \in D$, if edge $(u, v)$ is not in $E$, we call it a missing edge in $D$.
In addition, we call an edge in $D$ “negative” if it is incident to a missing edge in $D$, and
“positive” otherwise. We define the concept of generating edges of $D$ as follows.

Definition 2. (Generating edge) For any edge $(u, v)$ in $D$, if
$D = (D \cap N(u) \cap N(v)) \cup \{u, v\}$ and $\Psi(D) \ge \tau(D)$, we call $(u, v)$ a
generating edge of $D$. We further call $D$ a local core generated by $(u, v)$ and write
$\mathrm{gen}(u, v) = D$.
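
The two notions just defined can be checked mechanically on a small adjacency structure. The sketch below is an illustration under stated assumptions: the graph is a dict mapping each node to its neighbor set, the names `internal_density`, `is_generating_edge` and `positive_edges` are hypothetical, the density threshold is passed in as a caller-supplied function rather than fixed to any particular formula, and an edge is treated as positive exactly when both of its endpoints are adjacent to every other node of D (our reading of "not incident to a missing edge").

```python
def internal_density(D, adj):
    """Fraction of node pairs in D that are joined by an edge."""
    n = len(D)
    if n < 2:
        return 0.0
    m = sum(1 for u in D for v in adj[u] if v in D) // 2   # internal edges
    return m / (n * (n - 1) / 2)


def is_generating_edge(u, v, D, adj, threshold):
    """Check Definition 2: D = (D & N(u) & N(v)) | {u, v} and the density
    of D reaches threshold(D). `threshold` is supplied by the caller."""
    D = set(D)
    if u not in D or v not in D or v not in adj[u]:
        return False
    core = (D & adj[u] & adj[v]) | {u, v}
    return core == D and internal_density(D, adj) >= threshold(D)


def positive_edges(D, adj):
    """Edges inside D that touch no missing edge, i.e. both endpoints
    are adjacent to all other nodes of D."""
    D = set(D)
    full = {u for u in D if (D - {u}) <= adj[u]}
    return {(u, v) for u in full for v in full if u < v and v in adj[u]}


if __name__ == "__main__":
    # Triangle 1-2-3 with a pendant node 4 attached to node 3.
    adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
    half = lambda D: 0.5                   # hypothetical constant threshold
    print(is_generating_edge(1, 2, {1, 2, 3}, adj, half))
    print(is_generating_edge(1, 2, {1, 2, 3, 4}, adj, half))
    print(positive_edges({1, 2, 3}, adj))
```

In the triangle every edge is positive and generates the whole triangle, whereas once the pendant node is included no edge stays positive, because node 4 is not adjacent to nodes 1 and 2.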

For any community $C$ of $G$, a set $L \subseteq E$ is called a “generating edge set” of $C$
if $\bigcup_{(u,v) \in L} \mathrm{gen}(u, v) = C$. Since $C$ can be generated by different
generating edge sets and we are constrained by the node budget, we intuitively seek the
generating edge set of minimal cardinality.

Definition 3. (Minimum Generating Edge Set) Given a community C of G, the MGES
problem seeks a generating edge set L* of C with the smallest cardinality.

The cores generated by edges in an MGES of a community C of G are tightly
connected, and together they compose C. As a result, if we delete an endpoint
of every edge in an MGES, C will be broken into smaller modules, the number of
which is at least the number of edges in the MGES (Lemma 16). Since our goal is to
break the current community structure X into as many new communities as possible,
the removal of crucial nodes defined by edges in an MGES will be a good heuristic
for this purpose. But first and foremost, we need to characterize all MGESs in the
current community structure X based only on the input network G. Lemma 17 identifies
the location of the generating edge(s) of a local core in a community C: they have to
be adjacent to nodes with the highest degree in C. Based on this result, we present in Alg.
18 a procedure that can correctly find the MGES of a given community C (Theorem 8.1).


Algorithm 18 An optimal algorithm for finding the MGES
Input: Network G = (V, E) and a community C ∈ X;

Output: Minimum generating edge set L* of C;

0. Mark all nodes as “unassigned” and set L* = ∅.

1. Remove all negative edges in C. If any edges survive, they are candidates for
generating edges of their corresponding communities; include them in L* and go to
step 2. Otherwise, go to step 3.

2. Reconstruct local cores based on the generating edges found in step 1. Mark all nodes
in those communities as “assigned”. Discard generating edges in L* that fall into any
newly constructed community. Return if all nodes are assigned.

3. Find the set U as in Lemma 17. Find the edge in NE(U) that can generate a local
community having the largest size. Include this edge in L* and mark all nodes in the
new local community as “assigned”. Ties are broken randomly. Return if all nodes are
assigned.

4. If there are still unassigned nodes, say the set I ⊆ C, construct G_I = G[(I ∪ N(I)) ∩ C].
Go back to step 1.


Lemma 16. Let $L^*$ be an MGES of a community C. The removal of an endpoint of every
edge of $L^*$ breaks C into at least $|L^*|$ subcommunities.

Proof. Clearly, removing an endpoint of every edge in $L^*$ degrades the internal
density of each core, since the endpoint of a generating edge has full degree in its
core. Now suppose the number of subcommunities resulting from the node removal is less
than $|L^*|$. Then at least two cores, say $c_1$ and $c_2$, are merged together even
though their internal density has decreased. This cannot happen, since otherwise they
would have been identified as a single core in the first place. Their combination
would imply that C has an MGES of size less than $|L^*|$, which contradicts the
assumption that $L^*$ is an MGES of C. □

Lemma 17. Let C be a subset of V, $U = \{u \in C \mid d_u^{in} \text{ is the highest in } C\}$ and
$NE(U) = \{(u, v) \mid u \in U \text{ or } v \in U \text{ but not both}\}$. Then $|NE(U) \cap L^*| \geq 1$.

Proof. After each refresh in step 2, let u be the node with the highest in-degree in
C. After step 1 of Alg. 18, all negative edges have been deleted, since they do not
contribute to the actual generating set $L^*$. As such, the edges incident to u are not
negative, which in turn implies that they are candidates for generating edges. Now
iterate through all edges incident to u and choose the one that generates the largest
core. This edge must be in $L^*$. □
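
The two sets used in Lemma 17 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the graph is undirected and stored as a dictionary of neighbor sets, the "in-degree" of a node is taken to be its number of neighbors inside C, and NE(U) is restricted to edges inside C. The function names are hypothetical and do not come from the original text.

```python
def highest_degree_nodes(adj, community):
    """Return U: the nodes of `community` with the largest internal degree.

    Assumption: internal degree = number of neighbors inside the community.
    """
    internal_deg = {u: len(adj[u] & community) for u in community}
    top = max(internal_deg.values())
    return {u for u, d in internal_deg.items() if d == top}


def boundary_edges(adj, community, U):
    """Return NE(U): edges with exactly one endpoint in U.

    Assumption: only edges whose endpoints both lie in `community` are kept.
    Edges are returned as frozensets because the graph is undirected.
    """
    edges = set()
    for u in U:
        for v in adj[u] & community:
            if v not in U:
                edges.add(frozenset((u, v)))
    return edges


# Small example: node 1 has the highest internal degree in C = {1, 2, 3, 4}.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4}}
C = {1, 2, 3, 4}
U = highest_degree_nodes(adj, C)
NE = boundary_edges(adj, C, U)
```

By Lemma 17, at least one of the edges in `NE` belongs to the MGES of C, so step 3 of Alg. 18 only needs to scan this candidate set.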

Theorem 8.1. Let $d_C$ be the maximum in-degree of a node in C. Alg. 18 takes $O(d_C |C|)$
time in the worst case and returns an optimal solution to the MGES problem.

Proof. Since Lemma 17 ensures that at least one edge is added to $L^*$ each time,
and the procedure terminates when no edges are left, Alg. 18 terminates.
Moreover, it is verifiable that Alg. 18 takes time at most the number of edges in C,
which is $O(d_C |C|)$. Also, due to the intense internal density of a core, every time an
edge is added to $L^*$, that edge generates the largest core possible. The proof
follows from this fact, Lemma 17 and the exhaustive property of Alg. 18. □

Algorithm 19 genEdge - A node selection strategy for CSV based on generating edges
Input: Network G = (V, E), X = A(G);

Output: A set $S \subseteq V$ of k nodes;

1. Use Alg. 18 to find $L^*_{X_i}$ for all communities $X_i$ in X.

2. Sort all communities $X_i$ in X by the sizes of their MGESs.

3. Sort all nodes in G by the number of generating edges that they are incident to in
$X_i$. If there is a tie, sort them by their degrees in G.

4. Return the top k nodes from step 3.

With the optimal solution of MGES taken into account, we next suggest a heuristic
for selecting important nodes that follows the guidelines suggested previously. In
particular, our heuristic selects nodes in a greedy manner, starting from communities
that have large MGESs. Moreover, in the MGES of each community C, we give
priority to nodes that are incident to more generating edges, since their removal will
break C into more subcommunities.
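
The ranking step of genEdge can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it assumes the MGES of every community has already been computed (for example by Alg. 18) and is passed in as a list of edge lists, and it folds the community ordering of step 2 into one global ranking by the number of incident generating edges, with ties broken by degree in G. The name `gen_edge_select` is hypothetical.

```python
from collections import Counter


def gen_edge_select(adj, mges_by_community, k):
    """Return the top-k nodes ranked by incident generating edges.

    adj: dict mapping each node to the set of its neighbors in G.
    mges_by_community: one iterable of (u, v) generating edges per community
        (assumed to be precomputed).
    Ties are broken by degree in G, as in step 3 of Alg. 19.
    """
    incident = Counter()
    for edges in mges_by_community:
        for u, v in edges:
            incident[u] += 1
            incident[v] += 1
    # Counter returns 0 for nodes that touch no generating edge.
    ranked = sorted(adj, key=lambda u: (-incident[u], -len(adj[u])))
    return ranked[:k]


adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
mges = [[("a", "b")], [("d", "e"), ("a", "d")]]
selected = gen_edge_select(adj, mges, 3)
```

In this toy input, nodes a and d each touch two generating edges, and a wins the tie because it has the higher degree in G.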

8.5  Experimental  Results 

In this section, we show the empirical results of our node selection strategy for CSV
on both synthesized networks with known community structures and real-world social
traces including the Reality mining cellular dataset [29], Facebook [100] and Foursquare
[21] social networks. In order to certify the performance of our approach, we compare
the results obtained by the following methods: High degree centrality (highDeg) selects
the top k nodes in G with the highest degrees; betweenness centrality (betweeness) selects
the top k nodes in G with the highest betweenness (where the betweenness of a node u is the
number of shortest paths in G that pass through u); Generating edges (genEdge) is our
strategy described in Alg. 19; and finally, Node Importance (nodeimp) [105] selects the top k
nodes by their importance to the community structure.
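
For reference, the two centrality baselines can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code used in the experiments: the graph is undirected, unweighted and stored as a dictionary of neighbor sets, and betweenness is computed with Brandes' algorithm, which credits a node with the fraction of shortest paths passing through it when several shortest paths exist between a pair. The function names are hypothetical.

```python
from collections import deque


def high_deg(adj, k):
    """highDeg baseline: the k nodes with the highest degree."""
    return sorted(adj, key=lambda u: -len(adj[u]))[:k]


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {u: 0.0 for u in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {u: 0 for u in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {u: [] for u in adj}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Accumulate dependencies in reverse BFS order.
        delta = {u: 0.0 for u in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was visited from both of its endpoints.
    return {u: c / 2.0 for u, c in score.items()}


def top_betweenness(adj, k):
    """betweeness baseline: the k nodes with the highest betweenness."""
    b = betweenness(adj)
    return sorted(adj, key=lambda u: -b[u])[:k]


path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
```

On the four-node path, the two inner nodes each lie on two shortest paths between other pairs; on the star, every leaf-to-leaf path passes through the center.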

We first examine the effect of the underlying community detection methods by
comparing results obtained by the AFOCS [80], Blondel [6] and Oslom [61] algorithms to
the embedded ground truths. In particular, we set X to be the ground-truth community
structure and, when S is removed from the network, NMI(X, Y) is reported, where
Y = AFOCS(G[V\S]), Y = Blondel(G[V\S]) and Y = Oslom(G[V\S]), respectively.
These methods have been empirically certified in the literature to be the best algorithms
for finding non-overlapping and overlapping community structure [55]. Verifying our
strategy on synthesized networks not only certifies its performance but also gives us
confidence in its behavior when applied to real-world traces. We next report
the following quantities: (1) the NMI differences between community structures before
and after the node removal, which is our main objective function, (2) the number of
communities in the new structure, and (3) the average size of the network communities
in the new structure.

Figure 8-1. Comparison among different node selection strategies on synthesized
networks with N = 2500 nodes. Panels: (A) NMI scores by AFOCS, (B) NMI scores by
Blondel, (C) NMI scores by Oslom. Horizontal axis: k.

Figure 8-2. Comparison among different node selection strategies on synthesized
networks with N = 5000 nodes. Panels: (A) NMI scores on AFOCS, (B) NMI scores on
Blondel, (C) NMI scores on Oslom.

8.5.1  Results  on  synthesized  networks 

Set up: We use the well-known LFR overlapping benchmark [55] to generate
test networks. The numbers of nodes are N = 2500 and 5000, the mixing parameter is
μ = 0.15, and the community sizes are cmin = 10 and cmax = 50 for N = 2500 and cmin = 30
and cmax = 100 for N = 5000. Each time k nodes are removed from the network, the
network community structure is re-identified and compared to the original embedded one
(the ground truth). The overlapping threshold β in AFOCS is set at 0.7, and all tests
are averaged over 100 runs for consistency.
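
The evaluation loop described above can be sketched as follows. This is a hedged illustration only: `detect` stands for any community detection routine (AFOCS, Blondel and Oslom in the text) and `score` for the comparison measure (NMI in the text); both are passed in as callables because their implementations are outside the scope of this sketch. All names are hypothetical.

```python
def induced_subgraph(adj, removed):
    """Return the subgraph induced by the nodes that are not in `removed`."""
    return {
        u: {v for v in nbrs if v not in removed}
        for u, nbrs in adj.items()
        if u not in removed
    }


def removal_curve(adj, ranked_nodes, ks, detect, score, ground_truth):
    """Score the re-detected community structure after removing the top-k nodes.

    ranked_nodes: nodes ordered by a selection strategy (most important first).
    ks: iterable of removal budgets k.
    detect: callable mapping an adjacency dict to a community structure.
    score: callable comparing the ground truth with a detected structure.
    """
    results = {}
    for k in ks:
        removed = set(ranked_nodes[:k])
        structure = detect(induced_subgraph(adj, removed))
        results[k] = score(ground_truth, structure)
    return results
```

A run of the experiment would call `removal_curve` once per strategy and per detection algorithm, and average the resulting curves over the generated networks.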

8.5.1.1 Solution quality

We first evaluate the performance of all aforementioned node selection strategies
on the community detection algorithms AFOCS, Blondel and Oslom, respectively.
Because the ground-truth communities on synthesized networks are given a priori,
comparisons through NMI scores among these strategies, as well as among the detection
algorithms, are valid: the lower the NMI scores a strategy obtains, the more
effective it seems to be. In addition, the higher the remaining NMI measure a detection
algorithm obtains after the node removal, the more resistant to node vulnerability it
seems to be.
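
To make the NMI comparison concrete, the following Python sketch computes the standard normalized mutual information between two disjoint partitions. It is an assumption-laden illustration: the overlapping structures produced by AFOCS and Oslom would need an overlapping variant of NMI, which is not shown here, and restricting the comparison to nodes present in both structures (so that removed nodes are ignored) is a choice made for this sketch rather than something stated in the text.

```python
import math
from collections import Counter


def nmi(labels_x, labels_y):
    """NMI between two disjoint partitions given as node -> label dicts.

    Only nodes that appear in both dicts are compared (sketch assumption).
    Returns a value in [0, 1]; 1 means the partitions are identical.
    """
    common = [u for u in labels_x if u in labels_y]
    n = len(common)
    if n == 0:
        return 0.0
    cx = Counter(labels_x[u] for u in common)
    cy = Counter(labels_y[u] for u in common)
    cxy = Counter((labels_x[u], labels_y[u]) for u in common)
    h_x = -sum(c / n * math.log(c / n) for c in cx.values())
    h_y = -sum(c / n * math.log(c / n) for c in cy.values())
    mi = sum(
        c / n * math.log(c * n / (cx[a] * cy[b]))
        for (a, b), c in cxy.items()
    )
    if h_x + h_y == 0:
        # Both partitions consist of a single community.
        return 1.0
    return 2.0 * mi / (h_x + h_y)


truth = {1: "a", 2: "a", 3: "b", 4: "b"}
same = {1: "p", 2: "p", 3: "q", 4: "q"}
crossed = {1: "p", 2: "q", 3: "p", 4: "q"}
```

Here `same` is the ground truth with renamed labels, while `crossed` splits every ground-truth community in half, so it carries no information about the original structure.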

The quality of the node selection solutions is reported in figures 8-1 and 8-2. As a
general trend, NMI scores drop quickly as more nodes are removed from
the network when N = 2500; however, they degrade much more slowly in networks with
N = 5000. The first observation from those figures is that our approach genEdge
achieves the best (lowest) NMI scores on almost all test cases. On average, on networks
with 2500 nodes, genEdge is 14% better than both highDeg and betweeness and 12%
better than nodeimp on the AFOCS algorithm, and is 19%, 11% and 5% better than highDeg,
betweeness and nodeimp on the Blondel algorithm (figures 8-1A, 8-1B). On the Oslom algorithm,
genEdge differs insignificantly from highDeg and betweeness (1.5% and 1.4% better, respectively)
and lags behind only nodeimp, by 3% in NMI score. On networks with 5000
nodes, genEdge still outperforms the other strategies, with 12% lower NMI scores than the
others on the AFOCS algorithm, with 23%, 8% and 6% lower NMI scores than highDeg,
betweeness and nodeimp on the Blondel algorithm, and finally with 7%, 10% and 8% better
scores than the others on the Oslom algorithm (figure 8-2). These results imply that the genEdge node
selection strategy performs competitively across different community
detection algorithms in comparison with the other strategies.

Figure 8-3. Results obtained by AFOCS on networks with N = 2500 nodes and
N = 5000 nodes. Panels: (A) N = 2500, (B) N = 2500, (C) N = 5000, (D) N = 5000.

The second observation we obtain from figures 8-1 and 8-2 is that the top-of-the-list
node seems to be essential to the network community structure. The removal of only
this node from the network brings the NMI scores to as low as 0.7 - 0.8 on AFOCS
(figures 8-1A, 8-2A), to 0.58 - 0.6 on the Blondel algorithm (figures 8-1B, 8-2B), and to 0.7
on the Oslom algorithm. Furthermore, the top 15-20 nodes are also vital to the network
community structure detected by Oslom and Blondel, since their destruction brings
the NMI scores down to 0.5, the threshold where the community structure becomes
stochastic and fuzzy to recognize. The NMI values on the AFOCS algorithm, on the other
hand, do not suffer from this destruction, as they only come close to 0.5 when almost
k = 50 nodes are removed from the networks with N = 2500 nodes (figure 8-1A).

Finally, the last observation inferred from figures 8-1 and 8-2 is that, among the
three community detection algorithms, AFOCS obtains the highest remaining
NMI values when the same number of nodes is removed from the networks. In other
words, AFOCS was able to detect the community structure that was most
similar to the ground-truth communities. As discussed above, this observation
implies that AFOCS seems to be more resistant to
node vulnerability than the other algorithms. Therefore, we employ AFOCS as the main
community detection algorithm to further analyze network communities of real-world
traces.

8.5.1.2 The number of communities and their sizes

We  next  examine  the  number  of  communities  and  their  sizes  when  k  important 
nodes  are  removed  from  the  network.  As  discussed  in  subsection  8.4,  our  selection 
strategy  gives  priority  to  breaking  the  current  community  structure  into  more  communities 
while  looking  for  their  sizes  to  be  relatively  the  same  in  order  to  minimize  NMI  measure. 
The  results  are  presented  in  figure  8-3. 

As reported in these figures, the numbers of new communities generated by
genEdge tend to increase as more nodes are excluded; however, they differ insignificantly
from other methods on small networks of 2500 nodes (figure 8-3A), and the differences
become more visible on larger networks of 5000 nodes (figure 8-3C). In particular, the
number of communities generated by genEdge is the second highest when N = 5000
(only below the betweeness method), while the average sizes of communities are roughly
equal to those of the other methods (figures 8-3B and 8-3D). One might ask why genEdge
still obtains better NMI scores when its number of communities and average
community size are about the same as those of the other methods. One possible reason is that
the new communities formed by other strategies might be subcommunities
or parts of the original structure, which in turn results in high similarity to the
ground truth. Our strategy, on the other hand, makes sure that once a node incident
to the most generating edges is excluded, the subcommunity structure is broken and
the new community structure has little similarity to the original one, hence the lower
NMI measures.

Table 8-1. Statistics of social traces

Data         N       M      Avg. Deg   Max. Com. Size
Reality      100     3100   62         35
Facebook     63731   1.5M   23.50      33425
Foursquare   47260   1.1M   49.13      30381


8.5.2  Results  on  real-world  traces 

We further present the empirical results of CSV on real-world networks including
the Reality mobile phone data [29], Facebook [100] and Foursquare [21] datasets. An
overview of these datasets is given in Table 8-1.

The Reality Mining dataset is provided by the MIT Media Lab. It contains
communication, proximity, location, call, and activity information from 100 students
at MIT over the course of the 2004-2005 academic year. The Facebook dataset contains
friendship information (i.e., who is friends with whom, and wall posts) among the New
Orleans regional network on Facebook, spanning from Sep 2006 to Jan 2009. To
collect the information, the authors created several Facebook accounts, joined each
to the regional network, started crawling from a single user and visited all friends in a
breadth-first-search fashion. The Foursquare dataset contains the locations and activities of 47260
users on the Foursquare social network from May 2011 to Jul 2011. To collect the data, we
created several Foursquare accounts, joined the network, started crawling from a
single user and visited all friends, also in a breadth-first-search fashion.

Figure 8-4. NMI scores on Reality mining data, Foursquare and Facebook networks
obtained by AFOCS (k = 50...1000). Surviving panel title: (C) Facebook.

On the Reality Mining dataset, we set k = 1...20 and report the results in figure 8-4A. This
figure reveals that the community structure in this dataset is extremely vulnerable
to node attacks, since the removal of only 2 nodes found by genEdge is enough to make
the new community structure differ significantly from the original one, bringing
the NMI values down to 0.4. In comparison with the other node selection methods, genEdge still
performs very well and is about 14% - 17% better than the others. We note that the first
node identified by genEdge is indeed crucial to the community structure of this network,
since it immediately brings the NMI score down to 0.6, while the other methods do not seem to
discover this important feature. Furthermore, when too many nodes are removed from
the network, the network does not seem to contain communities anymore, or the community
structure becomes extremely fuzzy, as NMI values converge to around 0.2. This is
understandable since this dataset is small and has a very high average node degree.

On the larger networks, Facebook and Foursquare, we set k from 50 nodes to 1000
nodes (only 2.1% and 1.5% of the nodes of the Foursquare and Facebook networks, respectively)
with a 50-node increment at a time. The numerical results are reported in figure 8-4.

In general, the NMI values of all methods degrade quickly on the Foursquare network and
tend to decrease more slowly on the Facebook network. As more nodes are excluded from the
network, genEdge still achieves the best performance on both networks, with significantly
lower NMI values than the other methods. Specifically, on Foursquare, with its high average
degree and internal community density, the removal of the nodes incident to the most
generating edges in genEdge leads to a significant separation of the network community
structure, as NMI scores drop to 0.2 under genEdge. On the Facebook network, the
similarity between the original and new community structures seems to remain fairly high
even when all 1000 nodes are removed, whereas the new structure of ArXiv network is at
the edge of the stochastic threshold since the NMI measure is around 0.5. This implies
that the community structure in the Foursquare network is also extremely vulnerable to node
removal attacks, while the mature Facebook network does not seem to suffer from this threat.
One possible reason is that Facebook contains a giant community with low
average degree, so it requires much more effort to break that giant
community apart.

In summary, the experiments on both synthesized and real-world social networks
confirm the effectiveness of our proposed method based on generating edges. The
empirical results also confirm that genEdge outperforms the other heuristic methods
across different community detection methods such as the AFOCS, Blondel and Oslom algorithms.

8.6  An  Application  in  DTNs 

We  present  a  practical  application  where  the  detection  of  overlapping  network 
communities  plays  a  vital  role  in  forwarding  strategies  in  communication  networks. 

In order to evaluate the impact of community restructuring in complex networks, we
compare the set of critical nodes identified by our community structure vulnerability
algorithm to the sets of nodes selected by the aforementioned algorithms. Furthermore, in
order to evaluate which of the critical node sets is the most critical, we study how the
removal of each critical node set influences the performance of routing in Pocket Switched
Networks (PSN), in terms of average message delivery ratio and delivery time.

PSNs are a particular case of DTNs, where the nodes of the network correspond
to actual people who are equipped with portable devices (e.g., mobile phones) and
who use these devices to communicate. Because of the high degree of
mobility in this type of network, a path between a source and a destination seldom
exists; therefore, most approaches to routing in this kind of environment adopt a
store-carry-and-forward approach. In store-carry-and-forward approaches, messages
are stored locally and, depending on the approach, are forwarded or replicated to
the encountered nodes when an opportunity occurs. In this setting, a node is important
if it serves as a hub that forwards messages to other devices. As a result, the failure of
these important nodes degrades the message delivery ratio while incurring more
duplicate messages and a longer delivery time.

We use the HAGGLE dataset [94]. This trace was collected at the Infocom
conference in 2006 in Barcelona. 70 students and researchers attending the workshop
were equipped with iMote devices that registered their encounters for the duration of the
conference (3 days). In addition to the 70 mobile participants, approximately 20 static,
long-range iMotes were deployed throughout the area of the conference. A total of 1000
messages are created and uniformly distributed over the experiment duration, and
each message cannot exist longer than a threshold time-to-live. In our evaluation we
focus on the PSN routing algorithm inspired by BubbleRap [47]. While we expect
the performance of this protocol to deteriorate upon the removal of important nodes,
we expect the performance of BubbleRap to deteriorate more quickly because of
the protocol's reliance on the community structure. Because BubbleRap relies
on knowledge of the community structure to route messages, and because
different algorithms that attempt to find the community structure use different
objective functions that may be more susceptible to the removal of nodes, we
evaluate the average delivery ratio, the average delivery time and the average number of
copied messages.

Results 

As the removal of 10 nodes in the Haggle dataset is enough to make its original
community structure become stochastic (figure 8-6), we fix k = 10 and report the
results as a function of time-to-live (the amount of time a message can exist). The
performance of all methods is presented in figure 8-5. As reported in subfigures
8-5A, 8-5B and 8-5C, the removal of the nodes selected by the genEdge approach significantly
degrades the performance of the BubbleRap forwarding and routing system in terms of
not only delivered messages and delivery time but also the number of copied messages. As
depicted in subfigure 8-5A, the average number of messages delivered by BubbleRap
under genEdge with time-to-live = 450s is only two, whereas those under highDeg,
betweeness and nodeimp are four, three and three, which implies 100% and 50%
system downgrades when only 10 nodes are excluded from the network. This also
means that the nodes selected by genEdge play an important role in maintaining the normal
operation of the whole network. Furthermore, when nodes are removed from the
network, one expects the delivery time to increase as a consequence,
because participants now have fewer chances to communicate with each other and
thus it should take longer for participating devices to forward the carried messages.

This intuition is nicely reflected in figure 8-5B. As reported in this subfigure, the
average amount of time required to deliver carried messages increases significantly
as time-to-live increases (note that from 0-100s, no message was delivered,
and thus the delivery time was 0). In terms of delivery time, the removal of nodes under
genEdge causes the system to require a much larger amount of time to deliver the messages in
comparison with the other methods. In particular, the system delivery time under genEdge is
about 1.25x, 1.7x and 1.21x higher than that under betweeness, nodeimp and highDeg
when time-to-live = 450. Moreover, the number of copied messages under
the genEdge approach is also the highest among all methods. This means
that the genEdge heuristic indeed selects appropriate nodes whose removal
significantly reduces the system performance, as reported by the three evaluated factors.

Figure 8-5. Simulation results on HAGGLE dataset.

Figure 8-6. NMI measure on Haggle dataset.
CHAPTER 9
CONCLUSIONS

In this dissertation, we establish fundamental knowledge on the following
aspects of complex network science: (1) the network organizational principles via
the discovery of its dynamic community structure, (2) the assessment of the community
structure vulnerability, and (3) social-based solutions for practical applications
enabled by complex systems, such as in online social networks and mobile networks.

We suggest two adaptive frameworks for discovering the dynamic network community
structure and analyze theoretical results that guarantee their performance. From an
execution perspective, our methods are adaptive and thus scalable to very large
networks, with very competitive experimental results.

To  investigate  the  assessment  of  the  network  community  structure  vulnerability, 
we  introduce  the  new  problem  of  identifying  key  nodes  whose  removal  can  maximally 
reform  the  current  network  communities.  Those  nodes  are  important  in  maintaining 
the  normal  functioning  of  the  whole  system,  such  as  in  the  case  of  DTNs  (in  a  mobile 
network)  or  lung  cancer  (in  a  biological  network).  Our  work  presents  first  and  preliminary 
yet  important  insights,  in  terms  of  both  theoretical  results  and  heuristic  algorithms,  into 
the  vulnerability  assessment  of  the  network  community  structure. 

From an application perspective, our work in this dissertation focuses on proposing
novel community structure-based solutions for the following emerging problems: the
forwarding and routing strategy in mobile networks, the worm containment problem in
social networks, and limiting the spread of misinformation in online social networks. Our
suggested strategies provide a significant improvement in solution quality
for those problems and promise a wider range of applications enabled by
dynamic complex networks.
Complex network systems are extremely vulnerable to attacks. In the presence of
uncertainty, assessing network vulnerability before potential malicious attacks is vital for
network planning and risk management.

In  this  dissertation,  we  apply  optimization  theory  and  approximation  techniques 
to  address  the  following  fundamental  questions:  How  do  we  quantitatively  measure 
the  vulnerability  degree  of  the  network?  How  to  identify  critical  infrastructures  in  the 
network  in  the  context  of  both  individual  failures  and/or  cascading  failures  that  spread 
from  one  node  to  another  across  the  network  structure?  The  dissertation  provides 
several  new  theoretical  frameworks  and  approximation  algorithms  to  characterize  the 
network  vulnerability  and  critical  infrastructures,  which  advances  the  understanding 
of  network  vulnerability.  The  dissertation  tackles  the  above  questions  by  crossing 
and  contributing  new  techniques  to  several  research  areas  such  as  graph  theory, 
approximation  algorithms,  mathematical  programming,  and  computational  complexity. 

This  research  can  potentially  impact  many  applications  that  benefit  from  networks 
such  as  the  Internet,  smart  grids,  and  transportation  networks  where  vulnerability  is  an 
important  characteristic. 
CHAPTER 1
INTRODUCTION 

Assessing  network  vulnerability  before  potential  disruptive  events  such  as  natural 
disasters  or  malicious  attacks  is  vital  for  network  planning  and  risk  management.  It 
enables  us  to  seek  and  safeguard  against  most  destructive  scenarios  in  which  the 
overall  network  connectivity  falls  dramatically. 

There  have  been  numerous  efforts  on  proposing  evaluation  measures  of  the 
network  vulnerability,  as  summarized  in  [46].  On  one  hand,  several  global  graph 
measures,  such  as  Cyclomatic  number,  Maximum  network  circuits,  Alpha  index, 
and  Beta  index,  which  investigate  basic  graph  properties,  i.e.,  number  of  vertices, 
edges  and  pairwise  shortest  paths,  are  adopted  to  evaluate  the  network  vulnerability. 
However, these global measures can neither be rigorously mapped to the overall
network connectivity nor reveal the set of most critical vertices and edges, and thus are
not suitable for assessing the network vulnerability in terms of connectivity. On the other
hand,  researchers  focused  on  local  nodal  centrality  [18],  such  as  degree  centrality, 
betweenness  centrality  and  closeness  centrality,  in  order  to  differentiate  the  critical 
vertices  from  the  others,  and  further  evaluate  the  network  by  quantifying  such  vertices. 
Unfortunately,  being  unable  to  cast  these  local  properties  to  global  network  connectivity, 
these  measures  fail  to  indicate  accurate  vulnerabilities  and  cannot  reveal  the  global 
damage  done  on  the  network  under  attacks. 

To this end, in the first part of this proposal, we investigate a measure called
pairwise connectivity and formulate this vulnerability assessment problem as new
graph-theoretical optimization problems. The pairwise connectivity is the sum of the
connectivity of every vertex pair, which is quantified as 1 if the pair is (strongly) connected and 0 if
not. Our new optimization problems, called β-vertex disruptor and β-edge disruptor, aim
to discover the set of critical nodes/edges whose removal results in the sharpest decline
of the pairwise connectivity. With respect to a level of connectivity disruption, the more
vertices/edges that must be removed, the less vulnerable the network is; conversely,
the fewer vertices/edges that need to be removed, the easier the network is to destroy.
The β-disruptor problems are defined in section 1.1.
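
As an illustration of the measure, the pairwise connectivity of an undirected graph can be computed from the sizes of its connected components: a component with s nodes contributes s(s - 1)/2 connected pairs. The Python sketch below assumes an undirected graph stored as a dictionary of neighbor sets (the directed, strongly connected case is not covered), and the function name is hypothetical.

```python
def pairwise_connectivity(adj, removed=frozenset()):
    """Count unordered node pairs that remain connected after deleting `removed`.

    adj: dict mapping each node to the set of its neighbors (undirected graph).
    removed: nodes deleted from the graph before counting.
    """
    seen = set(removed)
    total = 0
    for start in adj:
        if start in seen:
            continue
        # Iterative DFS over one connected component.
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        total += size * (size - 1) // 2
    return total


path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
before = pairwise_connectivity(path)
after = pairwise_connectivity(path, removed={2})
```

On this five-node path, deleting the middle node leaves two components of two nodes each, so the pairwise connectivity falls from 10 to 2, whereas deleting an end node would only reduce it to 6.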

The second part of this proposal focuses on assessing network vulnerability against
cascading failures that spread among the nodes of a power grid or communication network
during a widespread outage, among financial institutions during a financial crisis, or
through a human population during the outbreak of an epidemic disease. We develop
a new measure of cascading resilience in networks subject to such cascading
failures. The cascading resilience (a.k.a. network vulnerability) is measured as the
minimum size of a set of nodes that can trigger an outbreak of failure across the whole network
in a short amount of time. Thus, we formulate the measurement of cascading resilience as an
optimization problem, called the cost-effective massive outbreak problem (CFM).

Since all of the formulated optimization problems are shown to be NP-complete, efficient
algorithms that find exact solutions are unlikely to exist. Thus, we focus on designing
algorithms with provable performance guarantees, known as approximation algorithms.
Furthermore, we devote one part of the proposal to designing scalable algorithms for
large-scale networks with hundreds of millions of links. Such algorithms are essential
for taking advantage of the availability of big data.

1.1  Connectivity-based  Vulnerability  Assessment 
1.1.1  Motivation 

Connectivity plays a vital role in network performance and is fundamental to
vulnerability assessment. Potential disruptive events, such as natural disasters or
malicious attacks, which destroy a set of interacting elements or connections,
can dramatically compromise connectivity and result in a considerable decline in
network QoS, or even break down the whole network [24, 26, 46, 62, 63, 68]. Given this
concern, proactive evaluation of network vulnerability with respect to connectivity,
in order to defend against such potential disruptions, is essential and beneficial to the
design and maintenance of any infrastructure network, for example communication,
commercial, and social networks.

1.1.2 β-disruptor Problems

Besides the homogeneous network model consisting of uniform nodes and
bidirectional links, the heterogeneous network model, where interacting
elements of different kinds are connected through unidirectional links with non-uniform
expenses, finds numerous applications nowadays [53, 58, 72], but also poses
multiple difficulties for optimization and maintenance. In light of this, we abstract our
general network model as a directed graph G(V, E), where V is a set of nodes
and E is a set of unidirectional links. The expense of each directed edge (u, v)
from vertex u to vertex v is quantified as a nonnegative value c(u, v), for all the
m = |E| links among the n = |V| nodes.

As mentioned above, our evaluation of network vulnerability is based on
the value of the overall pairwise connectivity in the abstracted graph, which is defined as
follows: given any vertex pair (u, v) ∈ V × V in the graph, we say that they are connected
iff there exist paths between u and v in both directions in G, i.e., they are strongly
connected to each other. The pairwise connectivity p(u, v) is quantified as 1 if this pair
is connected and 0 otherwise. Since the main purpose of a network lies in connecting all
the interacting elements, we study the aggregate pairwise connectivity between all pairs,
that is, the sum of the quantified pairwise connectivity, which we denote as
$\mathcal{P}(G) = \sum_{(u,v) \in V \times V} p(u, v)$ for graph G, with each unordered
pair counted once. Clearly, $\mathcal{P}(G)$ attains its maximum $\binom{n}{2}$ when G is a
strongly connected graph. Based on this, we have:

Definition 1. (Edge disruptor) Given 0 ≤ β ≤ 1, a subset S ⊂ E in G = (V, E) is
a β-edge disruptor if the overall pairwise connectivity in G[E \ S], obtained by
removing S from G, is no more than $\beta\binom{n}{2}$. By minimizing the cost of the edges
in S, we have the β-edge disruptor problem, i.e., find a minimum-cost β-edge disruptor in a
strongly connected graph G(V, E).

Recall that G is strongly connected iff for every vertex v in G, there is a directed
path from v to all other vertices. A subgraph of G is called a strongly connected
component (SCC) iff it is a maximal subgraph of G with all vertex pairs u, v within it
connected by directed paths in both directions. Assume that a β-edge disruptor S disrupts
the connectivity in G(V, E) by separating it into several smaller SCCs, say $C_i$ for
$i = 1, \dots, l$, i.e., $V = \bigcup_{i=1}^{l} C_i$. We have:

$$\mathcal{P}(G[E \setminus S]) = \sum_{i=1}^{l} \binom{|C_i|}{2} = \frac{1}{2}\left(\sum_{i=1}^{l} |C_i|^2 - |V|\right) = \frac{l}{2}\left[\left(\frac{n}{l}\right)^2 - \frac{n}{l} + \mathrm{Var}(C)\right]$$

where $\mathrm{Var}(C) = \frac{1}{l}\sum_{i=1}^{l} (|C_i| - \bar{C})^2 = \frac{1}{l}\sum_{i=1}^{l} \left(|C_i| - \frac{n}{l}\right)^2$.
Therefore, the two key factors affecting pairwise connectivity are the number of SCCs and
the variance of their sizes. They provide us an alternative measure for evaluating the
structural balance and fragmentation of the network.
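
The relation between pairwise connectivity, the number of SCCs, and the variance of their sizes can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names (scc_sizes, pairwise_connectivity, variance_form) and the edge-list input are hypothetical, and the SCCs are found with Kosaraju's two-pass search, which is a choice made for this example rather than a method named in the text. The variance form used is (l/2)[(n/l)^2 - n/l + Var(C)], which follows from the sum of |C_i|^2 being equal to l·Var(C) + n^2/l.

```python
from collections import defaultdict
from statistics import pvariance


def scc_sizes(n, edges):
    """Sizes of the strongly connected components of a digraph on 0..n-1."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: iterative DFS on G, recording vertices by finishing time.
    seen, order = set(), []
    for s in range(n):
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(fwd[s]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(fwd[nxt])))

    # Pass 2: sweep the reversed graph in decreasing finishing time;
    # each sweep collects exactly one SCC.
    seen, sizes = set(), []
    for s in reversed(order):
        if s in seen:
            continue
        seen.add(s)
        todo, size = [s], 0
        while todo:
            node = todo.pop()
            size += 1
            for w in rev[node]:
                if w not in seen:
                    seen.add(w)
                    todo.append(w)
        sizes.append(size)
    return sizes


def pairwise_connectivity(sizes):
    """Sum of C(|C_i|, 2) over the components."""
    return sum(s * (s - 1) // 2 for s in sizes)


def variance_form(sizes):
    """(l/2) * ((n/l)^2 - n/l + Var(C)), with the population variance."""
    l, n = len(sizes), sum(sizes)
    return l / 2 * ((n / l) ** 2 - n / l + pvariance(sizes))


# Two directed 3-cycles joined by a single one-way edge 2 -> 3.
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
sizes = scc_sizes(6, edges)            # two SCCs of size 3
print(pairwise_connectivity(sizes))    # 6
print(variance_form(sizes))            # same value from the variance form
```

For a fixed number of components, the variance term shows that evenly sized components give the lowest pairwise connectivity, which is why balanced fragmentation is the most damaging outcome.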

Similarly, we define the β-vertex disruptor and its corresponding optimization problem.
β-vertex disruptor problem: Given a strongly connected graph G(V, E) and a
fixed number 0 ≤ β ≤ 1, find a subset S ⊆ V of minimum size such that the total
pairwise connectivity in G[V \ S], obtained by removing S from G, is no more than
$\beta\binom{n}{2}$. Such a set S is called a β-vertex disruptor.
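
To make the problem statement concrete, the sketch below finds a minimum β-vertex disruptor on a tiny directed graph by exhaustive search. It is not one of the algorithms proposed in this work: it runs in exponential time and is meant only to illustrate the definition. The names pairwise_connectivity and min_vertex_disruptor, and the edge-list representation, are assumptions made for the example.

```python
from itertools import combinations
from math import comb


def pairwise_connectivity(vertices, edges):
    """Count unordered pairs {u, v} that can reach each other (tiny graphs only)."""
    vertices = set(vertices)
    adj = {v: [] for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:   # keep only the induced subgraph
            adj[u].append(v)

    def reachable(src):
        seen, todo = {src}, [src]
        while todo:
            for w in adj[todo.pop()]:
                if w not in seen:
                    seen.add(w)
                    todo.append(w)
        return seen

    reach = {v: reachable(v) for v in vertices}
    return sum(1 for u, v in combinations(sorted(vertices), 2)
               if v in reach[u] and u in reach[v])


def min_vertex_disruptor(n, edges, beta):
    """Smallest S with P(G[V - S]) <= beta * C(n, 2), by exhaustive search."""
    budget = beta * comb(n, 2)
    for size in range(n + 1):                 # try smaller sets first
        for S in combinations(range(n), size):
            rest = set(range(n)) - set(S)
            if pairwise_connectivity(rest, edges) <= budget:
                return set(S)


# Two directed 3-cycles that share vertex 0.
edges = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]
print(min_vertex_disruptor(5, edges, 0.2))    # {0}
```

Removing the shared vertex 0 leaves no directed cycle, so the pairwise connectivity drops from 10 to 0 with a single removal.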

1.1.3  Related  Work 

The classic vulnerability measures are mainly based on the centrality of
each vertex in the graph, including degree centrality, betweenness, closeness,
and eigenvector centrality [18]. However, these measures fail to indicate accurate
vulnerabilities and cannot reveal the global damage done to the network under attack.
On the other hand, the global graph measures are mainly functions of graph
properties, e.g., the number of vertices/edges, operational O-D pairs, operational
paths, and minimum shortest paths [24, 46, 62]. However, some of these attributes cannot
be calculated in polynomial time for dense networks. In essence, these functions do
not reveal the set of most critical vertices and edges, and thus are not suitable for
assessing network vulnerability in terms of connectivity. Several concepts similar to our
pairwise connectivity have recently been investigated in [10, 17, 73], where the terms
average pairwise connectivity, pairwise connected ratio, and cohesion were used.
However, none of these works formulated the calculation of this measure as
an optimization problem or provided a hardness proof along with approximation
algorithms with performance guarantees. Moreover, the β-disruptor problem studied in
this paper takes into account the roles of all edges and vertices in the global network
connectivity, and thus provides a more fundamental and thorough analysis of the
underlying vulnerability framework.

As a subproblem of this vulnerability assessment problem, the Critical Vertex/Edge
problems, which ask for the minimum number of vertices/edges whose removal disconnects
the graph, have also been studied and solved using extensive heuristics, however, without
performance guarantees. Some work of this kind in the context of wireless networks
includes [44][47][48]. Nevertheless, these works consider only whether or not the graph is
disconnected and ignore how fragmented the graph becomes. They are therefore insufficient
to evaluate the graph vulnerability.

Bissias et al. [14, 15] study the problem of bounding the damage under link attacks.
However, the provided methods either require solving a costly semidefinite programming
problem [15] or yield weak bounds due to the presence of partitions with negative
sizes [14].

1.1.4  Contributions 

Our  contributions  for  the  vulnerability  assessment  research  are  as  follows: 

•  Providing a novel underlying framework for vulnerability assessment
by investigating the pairwise connectivity and formulating it as an optimization
problem, β-disruptor, on general graphs, which comes in two versions: β-vertex
disruptor and β-edge disruptor;

•  Proving the NP-completeness of the two problems above and further proving that
no PTAS exists for β-vertex disruptor;

•  Presenting an O(log n log log n) pseudo-approximation algorithm for β-vertex
disruptor, and an O(log^{3/2} n) pseudo-approximation algorithm for β-edge
disruptor. These solutions can be applied to both homogeneous networks
and heterogeneous networks with unidirectional links and non-uniform nodal
properties.

•  We present a spectral lower-bound method for the link vulnerability assessment
problem, β-edge disruptor. The new lower-bound method is useful both for
comparing the vulnerability of different networks and for providing guarantees for
other heuristic assessment methods.

•  In Chapter 4, we present an O(√(log n)) bicriteria approximation algorithm for the
β-disruptor problem. Since β-vertex disruptor is a special case of β-disruptor,
the algorithm implies an O(√(log n)) bicriteria approximation algorithm for β-vertex
disruptor, which improves on the best previous result for β-vertex disruptor, the
O(log n log log n) bicriteria approximation algorithm.

•  In probabilistic networks, we first show that computing the expected pairwise
connectivity is #P-complete. In addition, we develop a Fully Polynomial-time
Randomized Approximation Scheme (FPRAS) to estimate network connectivity with
arbitrary precision.


1.2  Cascading  Failures  in  Critical  Infrastructures 

Malicious  attacks  can  cause  failures  to  spread  over  the  network.  Such  cascading 
processes  can  be  found  in  contagious  failures  that  spread  among  nodes  of  a  power  grid 
or  communication  network  during  a  widespread  outage,  among  financial  institutions 
during  a  financial  crisis,  or  through  a  human  population  during  the  outbreak  of  an 
epidemic  disease.  During  the  cascade  process,  nodes  are  assigned  states  which 
change  because  of  the  influence  of  their  neighbors.  For  example,  an  infected  node  can 
pass  the  infection  to  its  contacts  in  the  network,  and  the  infection  could  then  be  passed 
to  more  and  more  nodes.  We  focus  on  the  case  where  nodes  change  their  state  only 
when a certain fraction of their neighbors exert influence (see e.g. [21, 81, 82]).

We develop a new measure of cascading resilience for networks subject
to such cascading failures. The cascading resilience (a.k.a. network vulnerability) is
measured as the minimum size of a set of nodes that can trigger an outbreak of failure
across the whole network in a short amount of time. Thus, we formulate the measurement of
cascading resilience as an optimization problem, called the cost-effective massive
outbreak problem (CFM). The key difference from other works on cascading failure
and diffusion processes is that we consider the time aspect of the outbreak: we limit the
propagation of failure to within d hops from the failure sources.

Both theoretical analysis based on scale-free network theory and numerical analysis
demonstrate that a massive outbreak might involve costly seeding. To minimize
the seeding cost, we provide a mathematical programming formulation to find an optimal
seeding for medium-size networks and propose VirAds, an efficient algorithm, to tackle the
problem on large-scale networks. VirAds guarantees a relative error bound of O(1) from the
optimal solution in power-law networks and outperforms the greedy heuristic that
relies on degree centrality. Moreover, we also show that, in general, approximating
the optimal seeding within a ratio better than O(log n) is unlikely to be possible.

1.2.1  Problem  Definitions 

We are given a network modeled as an undirected graph G = (V, E) where
the vertices in V represent users in the network and the edges in E represent social
links between users. We use n and m to denote the number of vertices and edges,
respectively. The set of neighbors of a vertex v ∈ V is denoted by N(v) and we denote
by d(v) = |N(v)| the degree of node v.

We  continue  with  specifying  the  diffusion  model  that  governs  the  process  of 
cascading  failures.  Existing  diffusion  models  can  be  categorized  into  two  main  groups 
[49]: 

•  Threshold model. Each node v in the network has a threshold t_v ∈ [0, 1], typically
drawn from some probability distribution. Each connection (u, v) between nodes u
and v is assigned a weight w(u, v). For a node v, let F(v) be the set of neighbors
of v that are already influenced. Then v is influenced if
$t_v \le \sum_{u \in F(v)} w(u, v)$.

•  Cascade model. Whenever a node u is influenced, it is given a single chance to
activate each of its neighbors v with a given probability p(u, v).
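
The two model families can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the cited works: the function names, the dictionary-based inputs (neighbors, weight, threshold, prob), and the use of a seeded random.Random instance are all hypothetical choices made for the example.

```python
import random
from collections import deque


def linear_threshold(neighbors, weight, threshold, seeds):
    """Threshold model: repeat until no further vertex becomes influenced."""
    influenced = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in neighbors:
            if v in influenced:
                continue
            # Total weight arriving from already-influenced neighbors F(v).
            total = sum(weight[(u, v)] for u in neighbors[v] if u in influenced)
            if threshold[v] <= total:
                influenced.add(v)
                changed = True
    return influenced


def independent_cascade(neighbors, prob, seeds, rng):
    """Cascade model: each influenced u gets one attempt on each neighbor."""
    influenced = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()              # u is processed exactly once
        for v in neighbors[u]:
            if v not in influenced and rng.random() < prob[(u, v)]:
                influenced.add(v)
                frontier.append(v)
    return influenced


# Path 0 - 1 - 2 with weight 0.5 on every (directed) neighbor pair.
nbrs = {0: [1], 1: [0, 2], 2: [1]}
w = {(u, v): 0.5 for u in nbrs for v in nbrs[u]}
print(linear_threshold(nbrs, w, {0: 0.5, 1: 0.5, 2: 0.5}, {0}))
print(independent_cascade(nbrs, w, [0], random.Random(0)))
```

Because influence only ever grows, the threshold loop reaches the same final set regardless of the order in which vertices are examined; the cascade result depends on the random draws.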

Most papers on cascading processes assume that the probabilities p(u, v) or weights
w(u, v) and thresholds t_v are given as a part of the input. However, they are generally
not available, and inferring those probabilities and thresholds has remained a non-trivial
problem [43]. Therefore, in addition to the bounded propagation hop, we use a simplified
variation of the linear threshold model in which a vertex is activated if a fraction ρ of
its neighbors are active, as follows.

Locally Bounded Diffusion Model. Let R_0 ⊆ V be the subset of vertices selected
to initiate the influence propagation, which we call the seeding. We also call a vertex
v ∈ R_0 a seed. The propagation process proceeds in rounds, with all vertices in R_0
influenced (thus active in adopting the behavior) at round t = 0. At a particular round
t ≥ 0, each vertex is either active (adopted the behavior) or inactive, and each vertex's
tendency to become active increases when more of its neighbors become active. If an
inactive vertex u has more than ⌈ρ·d(u)⌉ active neighbors at round t, then it becomes
active at round t + 1, where ρ is the influence factor discussed below. The process
goes on for a maximum of d rounds, and a vertex, once active, will
remain active until the end. We call an initial set R_0 of vertices a d-seeding if R_0
can make all vertices in the network active within at most d rounds.
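
The round-based process described above can be simulated directly. The sketch below is illustrative only: the names spread and is_d_seeding and the adjacency-dictionary input are assumptions made for the example, and rho stands for the influence factor. By default the code follows the wording of the activation rule literally (strictly more active neighbors than the ceiling of rho times the degree); the flag strict=False switches to an "at least" reading, which is an assumption offered for comparison and not something the text states.

```python
from math import ceil


def spread(neighbors, seeds, rho, d, strict=True):
    """Set of active vertices after at most d synchronous rounds."""
    active = set(seeds)
    for _ in range(d):
        newly = set()
        for u, nbrs in neighbors.items():
            if u in active:
                continue
            need = ceil(rho * len(nbrs))
            have = sum(1 for w in nbrs if w in active)
            # Literal rule: have > need. Relaxed reading: have >= need.
            if have > need or (not strict and have == need):
                newly.add(u)
        if not newly:               # nothing changed, so nothing ever will
            break
        active |= newly             # all updates of a round apply together
    return active


def is_d_seeding(neighbors, seeds, rho, d, strict=True):
    """True if the seeds activate every vertex within d rounds."""
    return spread(neighbors, seeds, rho, d, strict) == set(neighbors)


# Complete graph on 4 vertices, rho = 1/2: each vertex has degree 3.
k4 = {v: [w for w in range(4) if w != v] for v in range(4)}
print(is_d_seeding(k4, {0, 1, 2}, 0.5, d=1))    # True
print(is_d_seeding(k4, {0, 1}, 0.5, d=1))       # False under the literal rule
```

Under the literal rule a vertex of degree 1 can never be activated by its single neighbor, so every such vertex would have to be a seed; the relaxed reading avoids this.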

The influence factor 0 < ρ < 1 is a constant that decides how widely and quickly the
influence propagates through the network. The influence factor ρ reflects real-world
factors such as how easy it is to share the content with others, or some intrinsic benefit
for those who initially adopt the behavior. In the case ρ = 1/2, the model is also known as
the majority model, which has many applications in distributed computing, voting
systems [66], etc.

Problem Definition. Given the diffusion model, the Cost-effective, Fast, and
Massive outbreak (CFM) problem is defined as follows.

Definition 2 (CFM Problem). Given an undirected graph G = (V, E) modeling a complex
network and an influence factor 0 < ρ < 1, find in V a minimum-size d-seeding, i.e., a
subset of vertices that can activate all vertices in the network within at most d rounds.
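
On very small graphs the CFM problem can be solved by trying seed sets in order of increasing size. The sketch below does exactly that and is meant only to illustrate the definition; it is exponential in the number of vertices and is not the VirAds algorithm or the mathematical programming approach mentioned earlier. The names activates_all and min_d_seeding are hypothetical, rho denotes the influence factor, and the activation test follows the wording of the diffusion model literally (strictly more active neighbors than the ceiling of rho times the degree).

```python
from itertools import combinations
from math import ceil


def activates_all(neighbors, seeds, rho, d):
    """True if `seeds` activates every vertex within d synchronous rounds."""
    active = set(seeds)
    for _ in range(d):
        newly = {u for u, nbrs in neighbors.items()
                 if u not in active
                 and sum(w in active for w in nbrs) > ceil(rho * len(nbrs))}
        if not newly:
            break
        active |= newly
    return len(active) == len(neighbors)


def min_d_seeding(neighbors, rho, d):
    """Smallest d-seeding by exhaustive search (tiny graphs only)."""
    vertices = sorted(neighbors)
    for size in range(len(vertices) + 1):       # smallest candidate sets first
        for seeds in combinations(vertices, size):
            if activates_all(neighbors, seeds, rho, d):
                return set(seeds)


# Cycle on 6 vertices with rho = 1/2: a vertex needs both neighbors active.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(min_d_seeding(cycle, 0.5, d=1))           # {0, 2, 4}
```

In this example half of the vertices must be seeded, a small-scale instance of the observation that fast and massive outbreaks can require a non-trivial fraction of the network as seeds.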

Generalization. The diffusion model can be generalized in several ways. For
example, the model can be extended naturally to cover directed networks or to specify
a different influence factor ρ_v for each node v ∈ V. For simplicity we stick with the
current model to avoid setting parameters during the experiments. Nevertheless, major
results such as the approximation ratio of the VirAds algorithm or the hardness of
approximation results still hold for the generalized models.

1.2.2 Related Work

Outbreak can be thought of as a diffusion of information about the product and its
adoption over the network. Kempe et al. [49, 50] formulated the influence maximization
problem as an optimization problem. They showed the problem to be NP-complete
and devised a (1 − 1/e − ε)-approximation algorithm. A major drawback of their
algorithm is that its accuracy ε and efficiency depend on the number of
Monte Carlo simulation runs of the propagation model. Later, Leskovec et al. [55] studied
influence propagation from a different perspective, in which they aim to find a set of
nodes in a network to detect the spread of a virus as soon as possible. They improve the
simple greedy method to run faster. The greedy algorithm is further improved by Chen et
al. [22] by using an influence estimation. However, the proposed algorithm might only
perform well for small values of the propagation probabilities. In addition, the
algorithm's time complexity should be O((m + k) log n) instead of the claimed
O(k log m + m).

Influence propagation with a limited number of hops was first considered by Wang et al.
[78, 83], whose proposed heuristic has high time complexity. Feng et al. [82] show
NP-completeness of the problem. We note that none of the mentioned approaches
handles large-scale social networks with millions of nodes, as we shall study in
Section 6.4.

1.2.3 Contributions

Our  contributions  are  summarized  as  follows: 

•  Our  first  finding  shows  that  the  seeding  for  fast  and  massive  spreading  must 
contain  a  non-trivial  fraction  of  nodes  in  the  networks,  which  is  cost-prohibitive  for 
large-scale  networks.  This  is  confirmed  by  both  our  theoretical  analysis  based  on 
the  power-law  model  in  [4]  and  our  extensive  experiments. 

•  We propose VirAds, a scalable algorithm to find a minimal seeding that
expeditiously propagates the influence to the whole network. VirAds outperforms
the greedy heuristics based on the well-known degree centrality and scales up to
networks of hundreds of millions of links. We prove that the algorithm guarantees a
relative error bound of O(1), assuming that the network is power-law.

•  We show how hard it is to obtain a near-optimal solution for CFM by proving the
impossibility of approximating the optimal solution within a ratio better than O(log n).

CHAPTER 2

MULTIPLE  LINK  ATTACKS 

We convert the vulnerability assessment into a graph-theoretical optimization
problem: finding a minimum set of vertices/edges whose removal degrades the
pairwise connectivity to a desired degree. Considering that disrupting these vertices
and edges will considerably degrade the network performance, we refer to them as a
β-disruptor throughout this paper, where 0 ≤ β ≤ 1 denotes the fraction of desired
pairwise connectivity (which we will define later). Two new optimization problems,
β-vertex disruptor and β-edge disruptor, are studied and proved to be NP-complete.
We address them with several pseudo-approximation algorithms with provable
performance bounds, which ensure the feasibility and accuracy of this evaluation
measure.

The benefit of our new measure is briefly illustrated in Fig. 2-1, compared with
the assessment using degree centrality. Notice that both networks A and B have 7
vertices and are originally strongly connected. According to the nodal degree centrality,
removing the black vertex with maximum outgoing degree 5 in Fig. 2-1(a) leaves
the network A still strongly connected with 5 vertices; and removing the black vertex
with maximum outgoing degree 4 in Fig. 2-1(b) partitions the graph into two strongly
connected components. In this sense, network A is somewhat stronger (less vulnerable)
than B. However, our model can discover that deleting only the grey vertex in A is
enough to decrease the overall connectivity to 0; on the contrary, at least 3 vertices in B
must be removed to make the overall connectivity 0. Therefore, A is actually much
more vulnerable. Clearly, our measure provides a more accurate assessment.

Figure 2-1. After the "central" vertex (in black) with maximum out-going degree is
removed, network (A) is still strongly connected while (B) is fragmented;
however, removing only one vertex (in grey) is enough to destroy
network (A).

Furthermore, our study over multiple disruption levels (different values of
β) presents a deeper meaning and greater potential. Several recent studies in the
context of wireless networks have aimed to discover the nodes/edges whose removal
disconnects the network, regardless of how disconnected it is [44][47][48]. Clearly,
this is a weaker version of our β-disruptor, since the resulting level of
network connectivity is not specified. However, it is not reasonable to limit the possible
disruption to only disconnecting the graph, ignoring how fragmented it is. For instance,
a scale-free network can tolerate high random failure rates [12], since the destruction of
boundary vertices may not significantly reduce the network connectivity even though the
whole graph becomes disconnected. In addition, different disruption levels may require
different disruptor sets, which our model can differentiate whereas existing methods
cannot. For example, the node centrality method always returns a set of nodes with
non-increasing degrees regardless of the disruption level.

This chapter is organized as follows. We first provide the hardness results in
Section 6.3. The pseudo-approximation algorithms for β-edge disruptor and β-vertex
disruptor are presented in Section 2.2 and Section 3.1, respectively. We propose a sparse
metric technique and a branch-and-cut algorithm to find the optimal β-vertex disruptor in
Section 3.3. Section 3.4 presents the simulation results comparing the performance of
the proposed approximation algorithms and the exact branch-and-cut algorithm.

2.1  Complexity  of  Finding  Disruptor 
In this section we show that both β-edge disruptor and β-vertex disruptor in
directed graphs are NP-complete, and thus have no polynomial-time exact algorithms
unless P = NP. We state a stronger result: both problems are NP-complete even in
undirected graphs with unit-cost edges.

Note that only in this section do we consider the problem for an undirected graph
G(V, E). All results in other sections are for directed graphs, thus covering
both homogeneous and heterogeneous networks.

2.1.1  NP-completeness  of  Edge  Disruptor 

We  use  a  reduction  from  the  balanced  cut  problem. 

Definition 3. A cut (S, V \ S) corresponding to a subset S ⊆ V in G is the set of edges
with exactly one endpoint in S. The cost of a cut is the sum of its edges' costs (or simply
its cardinality in the case all edges have unit costs). We often denote V \ S by $\bar{S}$.

Finding a min cut in a graph is polynomially solvable [71]. However, if one asks
for a somewhat "balanced" cut of minimum size, the problem becomes intractable. A
balanced cut is defined as follows:

Definition 4. (Balanced cut) The f-balanced cut problem on a graph G(V, E), where
$f : \mathbb{Z}^+ \to \mathbb{R}^+$, asks us to find a cut $(S, \bar{S})$ with the minimum
size such that $|S|, |\bar{S}| \ge f(|V|)$.

Abusing notation, for 0 < c ≤ 1/2 we also use c-balanced cut to refer to finding the cut
$(S, \bar{S})$ with the minimum size such that $\min\{|S|, |\bar{S}|\} \ge c|V|$. We will use
the following results on balanced cut shown in [77]:

Corollary 1. (Monotony) Let g be a function with

0 ≤ g(n) − g(n − 1) ≤ 1.

Then f(n) ≤ g(n) for all n implies f-balanced cut is polynomially reducible to
g-balanced cut.

Corollary 2. (Upper bound) αn^ε-balanced cut is NP-complete for α, ε > 0.
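
To make the cut and balanced cut definitions concrete, the sketch below computes the cost of a cut and finds a minimum c-balanced cut on a small undirected graph by exhaustive search. It is illustrative only and exponential in the number of vertices; the function names cut_cost and min_balanced_cut and the unit-cost edge list are assumptions made for the example.

```python
from itertools import combinations


def cut_cost(edges, S):
    """Number of unit-cost edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))


def min_balanced_cut(n, edges, c):
    """Cheapest cut (S, V - S) whose smaller side has at least c*n vertices."""
    best = None
    # By symmetry it is enough to let S be the smaller side, |S| <= n // 2.
    for size in range(1, n // 2 + 1):
        if size < c * n:
            continue
        for S in combinations(range(n), size):
            cost = cut_cost(edges, set(S))
            if best is None or cost < best[0]:
                best = (cost, set(S))
    return best     # None if no cut satisfies the balance requirement


# Two triangles joined by two edges, plus a pendant vertex 6 attached to 0.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
         (2, 3), (1, 4), (0, 6)]
print(min_balanced_cut(7, edges, 0.1))      # (1, {6})
print(min_balanced_cut(7, edges, 1 / 3))    # (2, {3, 4, 5})
```

With almost no balance requirement the cheapest cut simply isolates the pendant vertex; requiring each side to hold at least a third of the vertices forces the cut between the two triangles.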

It follows from Corollaries 1 and 2 that for every f = Ω(αn^ε), f-balanced cut is
NP-complete. We are ready to prove the NP-completeness of β-edge disruptor:

Figure 2-2. Construction of H(V_H, E_H) from G(V, E)

Theorem 2.1. (β-edge disruptor NP-completeness) β-edge disruptor in undirected
graphs is NP-complete even if all edges have unit weights.

Proof. We prove the result for the special case when β = 1/2. For other values of β the
proof can go through with a slight modification of the reduction. We shall assume that n,
the number of nodes, is a sufficiently large number (for our proof n > 103).

Consider the decision version of the problem that asks whether an undirected graph
G(V, E) contains a 1/2-edge disruptor of a specified size:

1/2-ED = {(G, K) | G has a 1/2-edge disruptor of size K}

To show that 1/2-ED is NP-complete we must show that it is in NP and that all
NP problems are polynomial-time reducible to it. The first part is easy; given a candidate
subset of edges, we can easily check in polynomial time whether it is a 1/2-edge disruptor
of size K. To prove the second part, we show that f-balanced cut is polynomial-time
reducible to 1/2-ED where f = [n~^2^J+-j.

Let G(V, E) be a graph in which one seeks to find an f-balanced cut of size k.
Construct the following graph H(V_H, E_H): V_H = V ∪ C_1 ∪ C_2, where C_1, C_2 are two
cliques of size \jf\. Denote by N = |V_H| = |C_1| + |C_2| + n the total number of nodes
in H. In addition to the edges in G, C_1, and C_2, connect each vertex v ∈ V to
⌊n²/4⌋ + 1 vertices in C_1 and ⌊n²/4⌋ + 1 vertices in C_2 so that the degrees of the nodes
within each clique differ by at most one. We illustrate the construction of H(V_H, E_H)
in Figure 2-2.

We show that there is an f-balanced cut of size k in G iff H has a 1/2-edge disruptor
of size K = n(⌊n²/4⌋ + 1) + k, where 0 ≤ k ≤ ⌊n²/4⌋. Note that the cost of any cut
(S, V \ S) in G is at most |S||V \ S| ≤ ⌊(|S| + |V \ S|)²/4⌋ = ⌊n²/4⌋.

On one hand, an f-balanced cut $(S, \bar{S})$ of size k in G induces a cut
$(C_1 \cup S, C_2 \cup \bar{S})$ of size exactly n(⌊n²/4⌋ + 1) + k. If we select the cut as
the disruptor, the pairwise connectivity will be at most $\frac{1}{2}\binom{N}{2}$.

On the other hand, assume that H has a 1/2-edge disruptor of size K = n(⌊n²/4⌋ + 1) + k.
Remove the edges in the disruptor to reduce the pairwise connectivity to at most
$\frac{1}{2}\binom{N}{2}$. Since cutting n nodes in C_1 or C_2 from the cliques requires
removing at least n(|C_1| − n) > n(⌊n²/4⌋ + 1) + k edges, let C_1' ⊆ C_1 and C_2' ⊆ C_2 be
giant connected subsets that induce connected subgraphs in C_1 and C_2. These subsets must
satisfy |C_1'| + |C_2'| > |C_1| + |C_2| − n. Denote by X_1, X_2 the subsets of nodes in V
that are connected to C_1' and C_2', respectively. We have X_1 ∩ X_2 = ∅; otherwise C_1' and
C_2' would be connected, and the pairwise connectivity would exceed
$\frac{1}{2}\binom{N}{2}$.

We will modify the disruptor without increasing its size or the pairwise connectivity
such that no nodes in the cliques are cut off, i.e., we alter the disruptor until C_1' = C_1
and C_2' = C_2. For each u ∈ C_1 \ C_1', remove from the disruptor all edges connecting u to
C_1' and add to the disruptor all edges connecting u to X_2. This will attach u to C_1' while
reducing the size of the disruptor by at least (|C_1| − n) − n. At the same time, select an
arbitrary node v ∈ X_1 and add to the disruptor all of v's remaining adjacent edges. This
increases the size of the disruptor by at most (⌊n²/4⌋ + 1) + n while making v isolated. By
doing so we decrease the size of the disruptor by at least
(|C_1| − n) − n − ((⌊n²/4⌋ + 1) + n) > 0. In addition, the pairwise connectivity will not
increase as we connect u to C_1' and at the same time disconnect v from C_1'.

If X_1 = ∅, we can select v ∈ X_2, as in that case |C_2' ∪ X_2| > |C_1' ∪ X_1|, which ensures
that the pairwise connectivity will not increase. We repeat the same process for every node
in C_2 \ C_2'. Since |(C_1 \ C_1') ∪ (C_2 \ C_2')| < n, the whole process finishes in fewer
than n steps and results in C_1' = C_1 and C_2' = C_2.

We will prove that X_1 ∪ X_2 = V, i.e., (X_1, X_2) induces a cut in G. Assume not; then the
cost to separate C_1 ∪ X_1 from C_2 ∪ X_2 will be at least
(⌊n²/4⌋ + 1)(|V − X_1| + |V − X_2|) = (⌊n²/4⌋ + 1)(2n − |X_1| − |X_2|) ≥
(⌊n²/4⌋ + 1)(n + 1) > n(⌊n²/4⌋ + 1) + k, which is a contradiction.

Since X_1 ∪ X_2 = V, the disruptor induces a cut in G. To have the
pairwise connectivity at most $\frac{1}{2}\binom{N}{2}$, both (C_1 ∪ X_1) and (C_2 ∪ X_2) must
have size at least
It follows that X_1 and X_2 must have size at least f(n) =  The cost of
the cut induced by (X_1, X_2) in G will be n(⌊n²/4⌋ + 1) + k − n(⌊n²/4⌋ + 1) = k. □

2.1.2 Hardness of Approximation: Vertex Disruptor

Theorem 2.2. $\beta$-vertex disruptor in undirected graphs is NP-complete.

Proof. We present a polynomial-time reduction from Vertex Cover (VC), an NP-hard problem [40]:

Instance: A graph $G$ and a positive integer $k$.
Question: Does $G$ have a VC of size at most $k$?

to a decision version of $\beta$-vertex disruptor with $\beta = 0$:

Instance: A graph $G$ and a positive integer $k$.
Question: Does $G$ have a $\beta$-vertex disruptor of size at most $k$ when $\beta = 0$?

The pairwise connectivity equals zero if and only if the complement of the disruptor is an independent set, or in other words, the disruptor is a VC. □
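The equivalence used in this reduction can be checked by brute force on a small graph. The sketch below is only an illustration: the function names and the toy instance (a 5-cycle with one chord) are assumptions introduced here, not part of the original text.

```python
from itertools import combinations

def pairwise_connectivity(n, edges, removed):
    """Count connected vertex pairs after deleting the vertices in `removed`."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        if u not in removed and v not in removed:
            parent[find(u)] = find(v)

    sizes = {}
    for x in range(n):
        if x not in removed:
            root = find(x)
            sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())

def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)

def smallest(n, predicate):
    """Size of a smallest vertex subset satisfying `predicate` (exhaustive search)."""
    for k in range(n + 1):
        if any(predicate(set(c)) for c in combinations(range(n), k)):
            return k
    return None

# Hypothetical toy instance: a 5-cycle with the chord (0, 2).
N = 5
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]

min_vc = smallest(N, lambda s: is_vertex_cover(EDGES, s))
min_disruptor = smallest(N, lambda s: pairwise_connectivity(N, EDGES, s) == 0)
```

On this instance both searches return the same size, as the proof predicts for every graph.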

Theorem 2.3. Unless P = NP, $\beta$-vertex disruptor cannot be approximated within a factor of 1.36.

Proof. We use the same reduction as in Theorem 2.2. Assume that we can approximate $\beta$-vertex disruptor within a factor less than 1.36 when $\beta = 0$. In [34], Dinur and Safra showed that approximating VC within a constant factor less than 1.36 is NP-hard. Since there is a one-to-one mapping between the set of vertex disruptors when $\beta = 0$ and the set of VCs, it follows that we can approximate VC within a factor less than 1.36, a contradiction. □

2.2 Bicriteria Approximation Algorithm for $\beta$-edge Disruptor

In this section, we present an $O(\log^{3/2} n)$ pseudo-approximation algorithm for the $\beta$-edge disruptor problem in the case when all edges have uniform cost, i.e. $c(u,v) = 1 \;\forall (u,v) \in E(G)$. Formally, our algorithm finds in a uniform directed graph $G$ a $\beta'$-edge disruptor whose cost is at most $O(\log^{3/2} n)\,OPT_{\beta\text{-}ED}$, where $\beta'/4 < \beta < \beta'$ and $OPT_{\beta\text{-}ED}$ is the cost of an optimal $\beta$-edge disruptor.

As shown in Algorithm 1, the proposed algorithm consists of two main steps. First, we construct a decomposition tree of $G$ by recursively partitioning the graph into two halves with a directed c-balanced cut. Second, we solve the problem on the obtained tree using a dynamic programming algorithm and transfer this solution to the original graph. These two main steps are explained in the next two sections.

2.2.1  Balanced  Tree-Decomposition 

A tree decomposition of a graph is a recursive partitioning of the node set into smaller and smaller pieces until each piece contains only one single node. We show the tree construction in Algorithm 1 (lines 1 to 9). Our decomposition tree is a rooted binary tree whose leaves represent nodes in $G$. (Because our decomposition tree is a binary tree with $n$ leaves, it contains exactly $n - 1$ non-leaf nodes. One can prove this by induction on the number of nodes.)

Definition 5. Given a directed graph $G(V, E)$ and a subset of vertices $S \subset V$, we denote the set of edges outgoing from $S$ by $\delta^+(S)$ and the set of edges incoming to $S$ by $\delta^-(S)$. A cut $(S, V \setminus S)$ in $G$ is defined as $\delta^+(S)$. A c-balanced cut is a cut $(S, V \setminus S)$ s.t. $\min\{|S|, |V \setminus S|\} \ge c|V|$. The directed c-balanced cut problem is to find the minimum c-balanced cut.

Note that a cut $(S, V \setminus S)$ separates all pairs $(u, v) \in S \times (V \setminus S)$, as paths from $u$ to $v$ cannot exist once $\delta^+(S)$ is removed; i.e. no SCC can contain vertices in both $S$ and $V \setminus S$.

The decomposition procedure is as follows. We start with the tree $T$ containing only one root node $t_0$. We associate the root node $t_0$ with the vertex set $V$ of $G$, i.e. $V(t_0) = V(G)$. For each node $t_i \in T$ whose $V(t_i)$ contains more than one vertex and has not been partitioned, we partition the subgraph $G[V(t_i)]$ induced by $V(t_i)$ in $G$ using a c-balanced cut algorithm. In detail, we use the directed c-balanced cut algorithm presented in [2], which finds in polynomial time a $c'$-balanced cut within a factor of $O(\sqrt{\log n})$ from the optimal c-balanced cut, for $c' = \alpha c$ and a fixed constant $\alpha$. The constant $c$ is chosen to be $1 - \sqrt{\beta/\beta'}$. Create two child nodes $t_{i1}, t_{i2}$ of $t_i$ in $T$ corresponding to the two sets of vertices of $G[V(t_i)]$ separated by the cut. We associate with $t_i$ a cut cost $cost(t_i)$ equal to the cost of the c-balanced cut.
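The recursive construction can be sketched as follows. This is a minimal illustration under loud assumptions: the balanced cut is found by exhaustive search, which is only a stand-in for the polynomial-time approximation algorithm of [2], and the value $c = 1/3$, the dictionary layout of a tree node and the toy graph are hypothetical choices made here.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil

def min_balanced_cut(vertices, edges, c):
    """Brute-force minimum c-balanced directed cut of the subgraph induced by `vertices`.

    Returns (cost, S) with cost = |delta^+(S)| counted inside the induced subgraph.
    Exponential time: a stand-in for the approximation algorithm of [2].
    """
    vs = sorted(vertices)
    n = len(vs)
    lo = max(1, ceil(c * n))  # smallest admissible side of the cut
    best = None
    for size in range(lo, n - lo + 1):
        for combo in combinations(vs, size):
            S = set(combo)
            cost = sum(1 for u, v in edges
                       if u in S and v in vertices and v not in S)
            if best is None or cost < best[0]:
                best = (cost, S)
    return best

def build_tree(vertices, edges, c):
    """Recursively split until every leaf holds a single vertex."""
    node = {"V": set(vertices), "cost": 0, "children": []}
    if len(node["V"]) >= 2:
        cost, S = min_balanced_cut(node["V"], edges, c)
        node["cost"] = cost
        node["children"] = [build_tree(S, edges, c),
                            build_tree(node["V"] - S, edges, c)]
    return node

def count_nodes(node):
    """Return (number of leaves, number of internal nodes)."""
    if not node["children"]:
        return 1, 0
    leaves, internal = 0, 1
    for child in node["children"]:
        child_leaves, child_internal = count_nodes(child)
        leaves += child_leaves
        internal += child_internal
    return leaves, internal

# Hypothetical toy graph: two directed 3-cycles joined by the single edge 2 -> 3.
EDGES = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
ROOT = build_tree(range(6), EDGES, Fraction(1, 3))
```

Here the root cut is $S = \{3, 4, 5\}$ with cost 0, because no edge leaves $S$; the resulting tree has 6 leaves and 5 internal nodes, matching the $n - 1$ count stated earlier.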

We define the root node $t_0$ to be on level 1. If a node is on level $l$, all its children are defined to be on level $l + 1$. Note that the collection of subsets of vertices in $G$ that correspond to the nodes on the same level of $T$ induces a partition in $G$.

One important parameter of the decomposition tree is the height, i.e. the maximum level of nodes in $T$. Using balanced cuts guarantees a small height of the tree, which in turn leads to a small approximation ratio. When separating $V(t_i)$ using the balanced cut, the size of the larger part is at most $(1 - c')|V(t_i)|$. Hence, we can prove by induction that if a node $t_i$ is on level $k$, the size of the corresponding set $V(t_i)$ is at most $|V| \times (1 - c')^{k-1}$. It follows that the tree's height is at most $O(-\log_{1-c'} n) = O(\log n)$.
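The height bound can be spelled out in one line. The derivation below is a sketch that assumes $c'$ is a fixed constant in $(0, 1/2]$ and uses the fact that a node that is still split must contain at least two vertices:

```latex
\[
2 \le |V(t_i)| \le n\,(1-c')^{k-1}
\;\Longrightarrow\;
k \le 1 + \log_{\frac{1}{1-c'}} \frac{n}{2},
\qquad\text{hence}\qquad
h(T) \le 2 + \log_{\frac{1}{1-c'}} \frac{n}{2} = O(\log n).
\]
```

For instance, with the hypothetical values $c' = 1/3$ and $n = 10^6$, this gives a tree of at most 34 levels.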

2.2.2  Dynamic  Programming  Algorithm  on  the  Decomposition  Tree 

In this section, we present the second main step, which uses dynamic programming to search for the right set of nodes in $T$ that induces a cost-efficient partition in $G$ whose pairwise connectivity is at most $\beta'\binom{n}{2}$. The details of this step are shown in Algorithm 1 (lines 10 to 15).

Algorithm 1. $\beta$-edge Disruptor
Input: Directed graph $G = (V, E)$ with uniform edge weights, and $0 < \beta < \beta' < 1$
Output: A $\beta'$-edge disruptor of $G$.

/* Construct the decomposition tree */
1.  $c \leftarrow 1 - \sqrt{\beta/\beta'}$
2.  $T(V_T, E_T) \leftarrow (\{t_0\}, \emptyset)$, $V(t_0) \leftarrow V(G)$, $l(t_0) = 1$
3.  while $\exists$ unvisited $t_i$ with $|V(t_i)| \ge 2$ do
4.      Mark $t_i$ visited, create new child nodes $t_{i1}, t_{i2}$ of $t_i$
5.      $V_T \leftarrow V_T \cup \{t_{i1}, t_{i2}\}$
6.      $E_T \leftarrow E_T \cup \{(t_i, t_{i1}), (t_i, t_{i2})\}$
7.      Separate $G[V(t_i)]$ using a directed c-balanced cut.
8.      Associate $V(t_{i1}), V(t_{i2})$ with the two separated components.
9.      $cost(t_i) \leftarrow$ the cost of the balanced cut
/* Find the minimum cost G-partitionable subset */
10. Traverse $T$ in post-order, for each $t_i \in T$ do
11.     for $p \leftarrow 0$ to $\beta'\binom{n}{2}$ do
12.         if $\mathcal{P}(G[V(t_i)]) \le p$ then $cost(t_i, p) \leftarrow 0$
13.         else $cost(t_i, p) \leftarrow \min\{cost(t_{i1}, p_1) + cost(t_{i2}, p_2) + cost(t_i) \mid p_1 + p_2 = p\}$
14. Find $F_{opt}$ associated with $T_{opt} = \min_{p \le \beta'\binom{n}{2}} \{cost(t_0, p)\}$
15. Return the union of the c-balanced cuts at $t_i \in A(F_{opt})$.

Figure 2-3. A part of a decomposition tree. $F = \{t_2, t_3, t_5, t_6\}$ is G-partitionable. The corresponding partition $\{V(t_2), V(t_3), V(t_5), V(t_6)\}$ in $G$ can be obtained by using cuts at ancestors of nodes in $F$, i.e. $t_0, t_1, t_4$.

Denote a set $F = \{t_{u_1}, t_{u_2}, \ldots, t_{u_k}\} \subseteq V_T$, where $V_T$ is the set of vertices in $T$, so that $V(t_{u_1}), V(t_{u_2}), \ldots, V(t_{u_k})$ is a partition of $V(G)$, i.e. $V(G) = \biguplus_{h=1}^{k} V(t_{u_h})$. We say such a subset $F$ is G-partitionable. Denote by $A(t_i)$ the set of ancestors of $t_i$ in $T$ and $A(F) = \bigcup_{t_i \in F} A(t_i)$. It is clear that $F$ is G-partitionable if and only if $F$ satisfies:

1. $\forall t_i, t_j \in F$: $t_i \notin A(t_j)$ and $t_j \notin A(t_i)$

2. $\forall t_i \in V_T$, $t_i$ is a leaf: $A(t_i) \cap F \ne \emptyset$

In case $F$ is G-partitionable, we can separate $V(t_{u_1}), V(t_{u_2}), \ldots, V(t_{u_k})$ in $G$ by performing the cuts corresponding to the ancestors of the nodes in $F$ during the tree construction. For example, in Figure 2-3 we show a decomposition tree with a G-partitionable set $\{t_2, t_3, t_5, t_6\}$. The corresponding partition $\{V(t_2), V(t_3), V(t_5), V(t_6)\}$ in $G$ can be obtained by cutting $V(t_0), V(t_1), V(t_4)$ successively using balanced cuts in the tree construction. The cut cost, hence, will be $cost(t_0) + cost(t_1) + cost(t_4)$. In general, the total cost of all the cuts to separate $V(t_{u_1}), V(t_{u_2}), \ldots, V(t_{u_k})$ will be:

\[ cost(F) = \sum_{t_u \in A(F)} cost(t_u). \]

The pairwise connectivity in $G$ then will be:

\[ \mathcal{P}(F) = \sum_{t_u \in F} \mathcal{P}(G[V(t_u)]). \]

We wish to find $F$ so that $\mathcal{P}(F) \le \beta'\binom{n}{2}$, i.e. the union of cuts to separate $V(t_{u_1}), V(t_{u_2}), \ldots, V(t_{u_k})$ forms a $\beta'$-edge disruptor in $G$. Because of the optimal substructure in $T$, finding such a G-partitionable subset $F$ in $V_T$ with minimum $cost(F)$ can be done in $O(n^3)$ using dynamic programming.

Denote by $cost(t_i, p)$ the minimum cut cost to make the pairwise connectivity in $G[V(t_i)]$ at most $p$ using only cuts corresponding to nodes in the subtree rooted at $t_i$. The minimum cost for a G-partitionable subset $F$ that induces a $\beta'$-edge disruptor of $G$ is then

\[ T_{opt} = \min_{p \le \beta'\binom{n}{2}} \{cost(t_0, p)\}, \]

where $t_0$ is the root node in $T$.

We can easily derive the recursive formula:

\[ cost(t_i, p) = \begin{cases} 0 & \text{if } \mathcal{P}(G[V(t_i)]) \le p, \\ \min_{\pi \le p} \; cost(t_{i1}, \pi) + cost(t_{i2}, p - \pi) + cost(t_i) & \text{otherwise,} \end{cases} \]

where $t_{i1}, t_{i2}$ are the children of $t_i$.

In the first case, when $\mathcal{P}(G[V(t_i)]) \le p$, we cut no edges in $G[V(t_i)]$; hence $cost(t_i, p) = 0$. Otherwise, we try all possible combinations of pairwise connectivity $\pi$ in $V(t_{i1})$ and $p - \pi$ in $V(t_{i2})$. The combination with the smallest cut cost is then selected.
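The recursion can be sketched directly with memoization. In this minimal illustration the `TreeNode` class, its field names and the toy tree are assumptions made here; in particular, the pairwise connectivity `pc` of each induced subgraph $G[V(t_i)]$ is taken as precomputed input rather than derived from a graph.

```python
from dataclasses import dataclass
from math import inf
from typing import Optional

@dataclass(eq=False)
class TreeNode:
    pc: int                          # pairwise connectivity of G[V(t_i)] (assumed precomputed)
    cut_cost: int = 0                # cost(t_i) of the balanced cut taken at this node
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None

def min_cut_cost(node, p, memo=None):
    """cost(t_i, p): cheapest set of tree cuts leaving pairwise connectivity <= p."""
    if memo is None:
        memo = {}
    key = (id(node), p)
    if key in memo:
        return memo[key]
    if node.pc <= p:
        best = 0                     # nothing needs to be cut below this node
    elif node.left is None or node.right is None:
        best = inf                   # a leaf has pc = 0, so this only guards bad input
    else:
        best = node.cut_cost + min(
            min_cut_cost(node.left, pi, memo) + min_cut_cost(node.right, p - pi, memo)
            for pi in range(p + 1)
        )
    memo[key] = best
    return best

def leaf():
    return TreeNode(pc=0)

# Hypothetical tree: a connected 4-vertex piece (6 pairs) split into two connected pairs.
left = TreeNode(pc=1, cut_cost=1, left=leaf(), right=leaf())
right = TreeNode(pc=1, cut_cost=1, left=leaf(), right=leaf())
root = TreeNode(pc=6, cut_cost=2, left=left, right=right)
```

For this toy tree, `min_cut_cost(root, 2)` is 2 (only the root cut is needed), while `min_cut_cost(root, 0)` is 4 (every cut in the tree is taken).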

We now prove that $T_{opt} \le O(\log^{3/2} n)\,Opt_{\beta\text{-}ED}$, where $Opt_{\beta\text{-}ED}$ denotes the cost of the optimal $\beta$-edge disruptor in $G$.

Lemma 1. There exists a G-partitionable subset of $T$ that induces a $\beta'$-edge disruptor whose cost is at most $O(\log^{3/2} n)\,Opt_{\beta\text{-}ED}$.

Proof. Let $D_\beta$ be an optimal $\beta$-edge disruptor in $G$ of size $Opt_{\beta\text{-}ED}$ and $\mathcal{C}_\beta = \{C_1, C_2, \ldots, C_k\}$ be the set of SCCs after removing $D_\beta$ from $G$.

We construct a G-partitionable subset $X_T$ as in Algorithm 2. We traverse the tree $T$ in preorder, i.e. every parent is visited before its children. For each node $t_i$, we select $t_i$ into $X_T$ if there exists some component $C_j \in \mathcal{C}_\beta$ such that $|V(t_i) \cap C_j| \ge (1 - c)|V(t_i)|$ and no ancestor of $t_i$ has been selected into $X_T$.

We can verify that $X_T$ satisfies the two conditions of a G-partitionable subset mentioned above.

For each $C_j \in \mathcal{C}_\beta$, define

\[ N(C_j) = \{t_i \in X_T : |V(t_i) \cap C_j| \ge (1-c)|V(t_i)|\}. \]

Since the sets $V(t_i)$, $t_i \in X_T$, are disjoint, we have

\[ \mathcal{P}(X_T) \le \sum_{t_i \in X_T} \binom{|V(t_i)|}{2} \le \cdots \le \beta'\binom{n}{2}. \]


Finally, we show that $cost(X_T) \le O(\log^{3/2} n)\,Opt_{\beta\text{-}ED}$. Denote by $h(T)$ the height of $T$ and by $L_i^T$ the set of nodes at the $i$th level in $T$. We have:

\[ cost(X_T) = \sum_{i=1}^{h(T)} \; \sum_{t_u \in L_i^T \cap A(X_T)} cost(t_u). \tag{2-1} \]

If $t_u \in A(X_T)$ then $t_u$ is not selected into $X_T$. Hence, there exists $C_j \in \mathcal{C}_\beta$ so that $|V(t_u) \cap C_j| < (1 - c)|V(t_u)|$ (otherwise $t_u$ would have been selected into $X_T$, as it satisfies the condition in line 1 of Algorithm 2). To guarantee $c < 1 - c$, we need $c < 1/2$, i.e. $\beta > \beta'/4$.

Since the edges in $D_\beta$ separate $C_j$ from the other SCCs, they also separate $C_j \cap V(t_u)$ from $V(t_u) \setminus C_j$ in $G[V(t_u)]$. Denote by $\delta(t_u, D_\beta)$ the set of edges in $D_\beta$ separating $C_j \cap V(t_u)$ from $V(t_u) \setminus C_j$ in $G[V(t_u)]$. Obviously, $\delta(t_u, D_\beta)$ is a directed c-balanced cut of $G[V(t_u)]$. Since the cut we used in the tree construction costs at most $O(\sqrt{\log n})$ times the optimal c-balanced cut, we have $cost(t_u) \le O(\sqrt{\log n})\,|\delta(t_u, D_\beta)|$.

Recall that if two nodes $t_u, t_v$ are on the same level then $V(t_u)$ and $V(t_v)$ are disjoint subsets. It follows that $\delta(t_u, D_\beta)$ and $\delta(t_v, D_\beta)$ are also disjoint sets. Therefore, the cut cost at the $i$th level is

\[
\sum_{t_u \in L_i^T \cap A(X_T)} cost(t_u)
\le O(\sqrt{\log n}) \sum_{t_u \in L_i^T \cap A(X_T)} |\delta(t_u, D_\beta)|
\le O(\sqrt{\log n}) \, \Bigl| \bigcup_{t_u \in L_i^T \cap A(X_T)} \delta(t_u, D_\beta) \Bigr|
\le O(\sqrt{\log n})\,Opt_{\beta\text{-}ED}.
\]

Since the number of levels is $h(T) = O(\log n)$, by Eq. 2-1 we have $cost(X_T) \le O(\log^{3/2} n)\,Opt_{\beta\text{-}ED}$. □

Since there exists a G-partitionable subset of $T$ that induces a $\beta'$-edge disruptor whose cost is no more than $O(\log^{3/2} n)\,Opt_{\beta\text{-}ED}$, as shown in Lemma 1, and the dynamic programming always finds the best such solution in $T$, the following theorem follows.

Theorem 2.4. Algorithm 1 achieves a pseudo-approximation ratio of $O(\log^{3/2} n)$ for the $\beta$-edge disruptor problem.

Time complexity. Construction of the decomposition tree takes $O(n^{9.5})$. The major portion of the time is spent solving a semidefinite program with $O(n^3)$ constraints. Finding the optimal solution using dynamic programming takes $O(n^3)$. Hence, the overall time complexity is $O(n^{9.5})$.

Algorithm 2. Find a good G-partitionable subset of $T$ that induces a $\beta'$-edge disruptor in $G$
Initialization: $X_T \leftarrow \emptyset$; Preorder-Selection($t_0$).
Preorder-Selection($t_u$)
1: if ($\exists\, C_j \in \mathcal{C}_\beta : |V(t_u) \cap C_j| \ge (1 - c)|V(t_u)|$) then
2:     $X_T \leftarrow X_T \cup \{t_u\}$
3: else let $t_{u1}, t_{u2}$ be the children of $t_u$
4:     Preorder-Selection($t_{u1}$)
5:     Preorder-Selection($t_{u2}$)
6: end if

2.3  Bounds  on  the  Size  of  Edge  Disruptor 

Simultaneous attacks can cause devastating damage, breaking down communication networks into small fragments. To mitigate the risk and develop proactive responses, it is essential to assess the robustness of the network in worst-case scenarios. In this paper, we propose a spectral lower-bound on the number of links that must be removed to incur a certain level of disruption in terms of pairwise connectivity. Our lower-bound exploits the latent structural information in the network Laplacian spectrum, the set of eigenvalues of the Laplacian matrix, to provide guarantees on the robustness of the network against intentional attacks. Such guarantees often cannot be obtained from heuristic methods for identifying critical infrastructures. For the first time, attack-resistance proofs for large-scale communication networks against link attacks are presented.

Connectivity plays a vital role in network performance and is fundamental to vulnerability assessment. The number of connected node pairs in the network (a.k.a. pairwise connectivity) lends itself as an effective measure to account for the effect of attacks [11, 14, 15, 30, 33, 62, 64].

Vulnerability assessment has recently been formulated as a connectivity optimization problem called $\beta$-edge disruptor, which finds a minimum-cost set of links whose removal causes a significant level ($\beta$) of degradation in network pairwise connectivity [33]. The $\beta$-edge disruptor reflects the common sense that, when breaking the network by removing links, the more links that must be removed, the less vulnerable the network is. The $\beta$-edge disruptor approach enables the exploration of different network disruption levels, which can be used to gain deeper insight into network structure and robustness in various operating environments.

Unfortunately, the $\beta$-edge disruptor problem is NP-hard [33], i.e. there is no efficient algorithm to solve the problem unless P = NP. A pseudo-approximation algorithm and mathematical approaches for the $\beta$-edge disruptor problem are introduced in [33] and [30], respectively. Although those methods can provide performance guarantees, they are only applicable to small and medium networks of a few thousand nodes. For larger networks, we have to rely on heuristics, which can have arbitrarily bad worst-case performance. Hence, there is a lack of methods to provide robustness proofs against intentional attacks for large networks.

In this paper, we analyze the network spectrum, the eigenvalues of the Laplacian matrix, to give a lower-bound for the minimum size of a $\beta$-edge disruptor and thus a certificate on the robustness of the network. Our spectral bound is formulated as an optimization problem over the Laplacian eigenvalues, which are known to contain rich information about the topological structure [23].

Since an exact measurement for the $\beta$-edge disruptor is not available in general, our lower-bound can be coupled with upper-bound methods¹ to narrow down the range of the actual vulnerability/robustness of the network. We emphasize that while upper bounds for $\beta$-edge disruptor (or any other minimization problem) can be designed easily, techniques for deriving lower-bounds are scattered across the literature. Our contributions are summarized as follows.

¹ Each heuristic to find a $\beta$-edge disruptor gives an upper bound for the problem.

• We introduce a new spectral lower-bound for the $\beta$-edge disruptor problem in the form of an eigenvalue optimization problem. At the same time, we enrich the literature on lower-bound techniques.

• We present two efficient methods to compute the proposed lower-bound: 1) the Lagrange multiplier method and 2) the dynamic programming algorithm. Moreover, the Lagrange multiplier method can derive the lower-bound from only a small number of the smallest eigenvalues. This is important for large networks, where computing the whole network spectrum is both time and memory consuming.

• We perform experiments on different network types and real large-scale networks to demonstrate the quality of the proposed lower-bound and quantify the robustness of the studied networks against intentional attacks.

Organization. We briefly present terminology and problem definitions in subsection 2.3.1. In subsection 2.3.2, we introduce the spectral lower-bound for the $\beta$-edge disruptor problem together with two methods to compute the lower-bound. Experimental results on different network models and real network instances are presented in subsection 6.4. Finally, we conclude the paper in subsection ??.

2.3.1 Laplacian Matrix and Its Eigenvalues

We abstract our general network model as a graph $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ refers to a set of nodes and $E$ refers to a set of links. Each edge $(v_i, v_j) \in E$ has a removal cost $c_{ij} > 0$ (and $c_{ij} = 0$ if $(v_i, v_j) \notin E$). For convenience, we also denote the number of nodes and links by $n$ and $m$, respectively.

Since the main purpose of a network lies in connecting all the interacting elements in the network, we study the overall pairwise connectivity, which is defined as the number of connected vertex pairs in $G$. If $G$ is an undirected graph, a vertex pair $(u, v) \in V \times V$ is connected iff there exists a path between $u$ and $v$. We denote the pairwise connectivity of a graph $G$ by $\mathcal{P}(G)$. The pairwise connectivity is maximized at $\binom{n}{2}$ when $G$ is a (strongly) connected graph.

Let $A = \{c_{ij}\}$ be the weighted adjacency matrix and $D$ be the degree matrix, defined as the diagonal matrix with the weighted degrees $d_1, d_2, \ldots, d_n$ on the diagonal, where $d_i = \sum_{j=1}^{n} c_{ij}$.

The unnormalized graph Laplacian matrix [61] is defined as

\[ L = D - A. \]

The matrix $L$ is symmetric and positive semi-definite, since for every vector $x \in \mathbb{R}^n$ we can verify that

\[ x^T L x = \frac{1}{2} \sum_{i,j=1}^{n} c_{ij}\,(x_i - x_j)^2 \ge 0. \tag{2-2} \]

A direct consequence is that $L$ has $n$ non-negative, real-valued eigenvalues $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$. In addition, the smallest eigenvalue $\lambda_1$ is zero and the corresponding eigenvector is the constant one vector $\mathbf{1}$ [61].
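The standard identity $x^T L x = \frac{1}{2}\sum_{i,j} c_{ij}(x_i - x_j)^2$ behind positive semi-definiteness, and the fact that the all-ones vector lies in the null space of $L$, can be checked numerically. The sketch below uses plain Python lists; the helper names and the small weighted graph are hypothetical choices made for illustration.

```python
def laplacian(n, weighted_edges):
    """Return L = D - A (as a list of rows) for an undirected weighted graph."""
    L = [[0] * n for _ in range(n)]
    for u, v, c in weighted_edges:
        L[u][u] += c
        L[v][v] += c
        L[u][v] -= c
        L[v][u] -= c
    return L

def quadratic_form(L, x):
    """Compute x^T L x."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))

def edge_sum(weighted_edges, x):
    """(1/2) * sum_{i,j} c_ij (x_i - x_j)^2, i.e. each undirected edge counted once."""
    return sum(c * (x[u] - x[v]) ** 2 for u, v, c in weighted_edges)

# Hypothetical weighted graph: a triangle 0-1-2 with a pendant vertex 3.
EDGES = [(0, 1, 2), (1, 2, 1), (2, 0, 3), (2, 3, 5)]
L = laplacian(4, EDGES)
x = [3, -1, 4, 2]
```

Both `quadratic_form(L, x)` and `edge_sum(EDGES, x)` evaluate to 80 for this vector, and every row of `L` sums to zero, which is why $\lambda_1 = 0$ with eigenvector $\mathbf{1}$.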

The second smallest eigenvalue $\lambda_2$ is known as the algebraic connectivity of the graph and can be used to describe many properties of graphs [61]. For example, the graph $G$ is connected if and only if $\lambda_2 > 0$. For the $\beta$-edge disruptor problem, the following lower-bound can be derived from $\lambda_2$.

Lemma  2.  For  any  connected  graph  G,  we  have 


(2-3) 


where $OPT_\beta$ denotes the minimum size of a $\beta$-edge disruptor.

However, the bound provided in Eq. 2-3 is rather loose, as the value of $\lambda_2$ is often very close to zero (for example, when bridges, i.e. edges whose deletion increases the number of connected components, are present in the network). This motivates us to study higher eigenvalues beyond $\lambda_2$ to design a stronger bound for the $\beta$-edge disruptor problem.

2.3.2 Spectral Lower-bound for Link Assessment

In this subsection, we derive a lower-bound on the size of a $\beta$-edge disruptor using higher eigenvalues of the Laplacian matrix $L$. We first formulate the lower-bound as an eigenvalue optimization problem. Then two methods with different trade-offs between time and quality are introduced to compute the lower-bound.

Let $E^*_\beta$ be an optimal $\beta$-edge disruptor and $s_1^* \ge s_2^* \ge \ldots \ge s_k^*$ be the sizes of the connected components after removing $E^*_\beta$ from the network. Then we can relate $OPT_\beta$ to the sizes of the components via the following lemma.

Lemma 3. [35] Let a k-partition of a graph be a division of the vertices into $k$ disjoint subsets containing $s_1 \ge s_2 \ge \ldots \ge s_k$ vertices. Let $E_{cut}$ be the set of edges whose two endpoints belong to different subsets. Let $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_k$ be the $k$ smallest eigenvalues of the Laplacian matrix plus any diagonal matrix $U$ such that the sum of all the elements of $U$ is zero. Then

\[ |E_{cut}| \ge \frac{1}{2} \sum_{i=1}^{k} \lambda_i s_i. \]

Thus, we have $OPT_\beta = |E^*_\beta| \ge \frac{1}{2} \sum_{i=1}^{k} \lambda_i s_i^*$. Here we allow imaginary subsets of size zero and assume w.l.o.g. that $k = n$. Note that $s_1^*, \ldots, s_n^*$ are not known without finding $E^*_\beta$. Thus, we consider all possible values of $\{s_1, \ldots, s_n\}$ that correspond to network partitions of pairwise connectivity at most $\beta\binom{n}{2}$ and take the minimum of the sum $\frac{1}{2} \sum_{i=1}^{n} \lambda_i s_i$ as a lower-bound on $OPT_\beta$.
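As a sanity check of Lemma 3 (a worked example that is not part of the original text), take the unweighted complete graph $K_n$ with $U = 0$. Its Laplacian spectrum is $\lambda_1 = 0$ and $\lambda_2 = \ldots = \lambda_n = n$, and a partition with part sizes $s_1 \ge \ldots \ge s_k$ cuts every edge between distinct parts:

```latex
\[
|E_{cut}| = \frac{1}{2}\Bigl(n^2 - \sum_{i=1}^{k} s_i^2\Bigr),
\qquad
\frac{1}{2}\sum_{i=1}^{k} \lambda_i s_i = \frac{n}{2}\sum_{i=2}^{k} s_i = \frac{n\,(n - s_1)}{2},
\]
\[
|E_{cut}| - \frac{n\,(n - s_1)}{2}
= \frac{1}{2}\Bigl(n s_1 - \sum_{i=1}^{k} s_i^2\Bigr)
= \frac{1}{2}\sum_{i=1}^{k} s_i\,(s_1 - s_i) \ge 0 .
\]
```

The bound therefore holds for every partition of $K_n$ and is tight exactly when all nonempty parts have the same size.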

Formally, our spectral lower-bound on $OPT_\beta$ is given by solving the following quadratic programming (QP) optimization problem.

\begin{align}
\text{minimize} \quad & \frac{1}{2} \sum_{i=1}^{n} \lambda_i s_i \tag{2-4} \\
\text{subject to} \quad & \sum_{i=1}^{n} s_i = n \tag{2-5} \\
& \sum_{i=1}^{n} \binom{s_i}{2} \le \beta \binom{n}{2} \tag{2-6} \\
& s_i \in \{0, 1, \ldots, n\} \tag{2-7}
\end{align}

Theorem 2.5. Let $Q_\beta$ be the optimal objective of the QP problem (2-4)-(2-7) and $OPT_\beta$ be the minimum size of a $\beta$-edge disruptor of the graph $G = (V, E)$. Then $Q_\beta \le OPT_\beta$ for $\beta \in [0, 1]$. Moreover, equality holds when $\beta = 0$ or $\beta = 1$.

Proof. As discussed in the previous paragraph, the sizes of the connected components after removing an optimal $\beta$-edge disruptor satisfy all constraints (2-5)-(2-7). Hence, $Q_\beta \le OPT_\beta$ for all $\beta \in [0, 1]$. We continue with the tightness of the bound in the extreme cases $\beta = 0$ and $\beta = 1$.

Case $\beta = 0$: all subsets are of size one. Hence, $Q_0 = \frac{1}{2} \sum_{i=1}^{n} \lambda_i = \frac{1}{2}\mathrm{Trace}(L) = \frac{1}{2}(2|E|) = |E|$. The only way to disconnect all pairs in the network is to cut all edges. In other words, $Q_0 = OPT_0 = |E|$.

Case $\beta = 1$: in order to achieve the maximum connectivity $\binom{n}{2}$, there must be a single component in the network, and the optimal disruptor cuts no edges. That is, $s_1 = n$ and $s_i = 0 \;\forall i > 1$. Since $\lambda_1 = 0$, it follows that $Q_1 = 0 = OPT_1$. □

Since the $s_i$ are integral values, we propose a dynamic programming algorithm to compute the spectral bound in the next subsubsection.

2.3.2.1 Dynamic Programming Method

We first describe the optimal solution structure for the optimization problem in (2-4)-(2-7).

Lemma 4. There exists an optimal solution $s^*$ of QP (2-4)-(2-7) such that $s_1^* \ge s_2^* \ge \ldots \ge s_n^*$.

Proof. Let $s^* = \{s_1^*, \ldots, s_n^*\}$ be an optimal solution of QP (2-4)-(2-7). Denote by $inv(s^*)$ the number of inversions of $s^*$, i.e. pairs of indices $(i, j)$ with $i < j$ such that $s_i^* < s_j^*$. If $inv(s^*) = 0$, then $s_1^* \ge s_2^* \ge \ldots \ge s_n^*$. Otherwise, there exists a pair $i < j$ with $s_i^* < s_j^*$. Construct $s'$ by swapping $s_i^*$ and $s_j^*$ inside $s^*$. Then $s'$ is a feasible solution of QP (2-4)-(2-7), and the objective decreases by $\frac{1}{2}\bigl[s_i^*\lambda_i + s_j^*\lambda_j - (s_j^*\lambda_i + s_i^*\lambda_j)\bigr] = \frac{1}{2}(s_j^* - s_i^*)(\lambda_j - \lambda_i) \ge 0$. Thus, $s'$ is also optimal and has fewer inversions. Repeating the process at most $\binom{n}{2}$ times, which is the maximum number of inversions in $s^*$, we finally obtain an optimal solution with no inversions. That optimal solution satisfies the lemma's condition. □


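Algorithm 3: ILB($G$, $\beta$)
1: Compute $\lambda_1, \ldots, \lambda_n$
2: Initialize $\mathcal{L}_k(l, p) = +\infty$ if $p < p_{min}(l, k)$, and $\mathcal{L}_k(l, p) = \lambda_1 l = 0$ if $p \ge p_{max}(l, k)$
3: for $k = 1$ to $n$
4:     for $l = 1$ to $n$
5:         for $p = p_{min}(l, k)$ to $\min\{\beta\binom{n}{2}, p_{max}(l, k)\}$
6:             $\mathcal{L}_k(l, p) = \min\{\mathcal{L}_{k-1}(l, p),\; \mathcal{L}_k(l - k, p - l + k) + \sum_{i=1}^{k} \lambda_i\}$
7:     if $\mathcal{L}_{k-1}(n, \beta\binom{n}{2}) = \mathcal{L}_k(n, \beta\binom{n}{2})$
8:         return $\lceil \frac{1}{2}\mathcal{L}_k(n, \beta\binom{n}{2}) \rceil$
9: return $\lceil \frac{1}{2}\mathcal{L}_n(n, \beta\binom{n}{2}) \rceil$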

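For $k, l \le n$ and $p \le \binom{n}{2}$, define $\mathcal{L}_k(l, p)$ to be the minimum spectral bound obtained by the first $k$ subsets whose total size is $l$ and whose total pairwise connectivity is at most $p$. That is,

\[ \mathcal{L}_k(l, p) = \min\Bigl\{ s^{(k)T}\lambda^{(k)} \;:\; \|s^{(k)}\|_1 = l,\; \sum_{i=1}^{k} \binom{s_i}{2} \le p \Bigr\}. \]

Since the objective (2-4) carries a factor $\frac{1}{2}$, the optimal objective value of QP (2-4)-(2-7) is given by $Q_\beta = \frac{1}{2}\mathcal{L}_n(n, \beta\binom{n}{2})$.

By Lemma 4, we pay attention only to partitions satisfying $s_1 \ge s_2 \ge \ldots \ge s_n$. We now derive the recursive formula for $\mathcal{L}_k(l, p)$ based on the optimal substructure of the QP problem. Consider the two possible cases of $s_k$:

• $s_k = 0$: There are at most $k - 1$ nonempty subsets whose sizes sum up to $l$. Hence, in this case $\mathcal{L}_k(l, p) = \mathcal{L}_{k-1}(l, p)$.

• $s_k > 0$: Since $s_1 \ge s_2 \ge \ldots \ge s_k > 0$, let $\bar{s}_i = s_i - 1 \ge 0$. The vector $\bar{s} = \{\bar{s}_1, \bar{s}_2, \ldots, \bar{s}_k\}$ satisfies simultaneously the following:

\[ \sum_{i=1}^{k} \lambda_i s_i = \sum_{i=1}^{k} \lambda_i \bar{s}_i + \sum_{i=1}^{k} \lambda_i, \]

\[ \sum_{i=1}^{k} \bar{s}_i = l - k, \]

\[ \sum_{i=1}^{k} \binom{\bar{s}_i}{2} = \sum_{i=1}^{k} \Bigl[\binom{s_i}{2} - s_i + 1\Bigr] = \sum_{i=1}^{k} \binom{s_i}{2} - l + k \le p - l + k. \]

Therefore, in this case $\mathcal{L}_k(l, p) = \mathcal{L}_k(l - k, p - l + k) + \sum_{i=1}^{k} \lambda_i$.

In summary, we have

\[ \mathcal{L}_k(l, p) = \min\Bigl\{ \mathcal{L}_{k-1}(l, p),\;\; \mathcal{L}_k(l - k, p - l + k) + \sum_{i=1}^{k} \lambda_i \Bigr\}. \]

We compute the values of $\mathcal{L}_k(l, p)$ in increasing order of $k$, $l$ and $p$, so that both terms on the right-hand side are available when needed. The base cases for $\mathcal{L}_k(l, p)$ are as follows.

\[ \mathcal{L}_k(l, p) = \begin{cases} +\infty, & \text{if } p < p_{min}(l, k) \\ \lambda_1 l = 0, & \text{if } p \ge p_{max}(l, k) \end{cases} \tag{2-8} \]

where $p_{min}(l, k) = (l \bmod k)\binom{\lceil l/k \rceil}{2} + (k - l \bmod k)\binom{\lfloor l/k \rfloor}{2}$ and $p_{max}(l, k) = \binom{l}{2}$ are the minimum and maximum pairwise connectivity of a graph with $l$ vertices and $k$ connected components (some possibly empty), respectively.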
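The recurrence can be sketched in a few lines and cross-checked against exhaustive enumeration of integer partitions. Everything below is an illustration under stated assumptions: the eigenvalues are supplied by the caller (here the Laplacian spectrum 0, 1, 1, 3, 3, 4 of the 6-cycle), the budget $\beta\binom{n}{2}$ is rounded down, the table stores $\sum_i \lambda_i s_i$ with the factor $\frac{1}{2}$ applied at the end, and the simple boundary conditions replace the $p_{min}$/$p_{max}$ base cases.

```python
from functools import lru_cache
from math import comb, inf

def ilb(eigs, beta):
    """Integer spectral bound Q_beta (before rounding up) from Laplacian eigenvalues."""
    lam = sorted(eigs)
    n = len(lam)
    budget = int(beta * comb(n, 2))           # floor of beta * C(n, 2)
    prefix = [0]
    for value in lam:
        prefix.append(prefix[-1] + value)     # prefix[k] = lam_1 + ... + lam_k

    @lru_cache(maxsize=None)
    def table(k, l, p):
        # min of sum(lam_i * s_i) over k sizes with total l and sum C(s_i, 2) <= p
        if p < 0:
            return inf
        if l == 0:
            return 0
        if k == 0:
            return inf
        best = table(k - 1, l, p)             # case s_k = 0
        if l >= k:                            # case s_k > 0: peel one vertex off each subset
            best = min(best, table(k, l - k, p - l + k) + prefix[k])
        return best

    return table(n, n, budget) / 2

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def ilb_brute_force(eigs, beta):
    """Same bound by enumerating every feasible size vector (tiny n only)."""
    lam = sorted(eigs)
    n = len(lam)
    budget = int(beta * comb(n, 2))
    best = inf
    for sizes in partitions(n):
        if sum(comb(s, 2) for s in sizes) <= budget:
            best = min(best, sum(a * b for a, b in zip(lam, sizes)))
    return best / 2

CYCLE6 = [0, 1, 1, 3, 3, 4]   # Laplacian eigenvalues of the 6-cycle
```

For the 6-cycle the sketch reproduces the two tight cases of Theorem 2.5: `ilb(CYCLE6, 0)` equals 6, the number of edges, and `ilb(CYCLE6, 1)` equals 0.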
Theorem 2.6. Optimal solutions of QP (2-4)-(2-7) can be found in $O(n^4)$ time and $O(n^3)$ space.

Thus, the spectral bound can be computed in polynomial time. However, the high time complexity of the dynamic programming algorithm prevents the method from being applied to large networks. Moreover, the dynamic programming algorithm requires computing the whole set of eigenvalues of the network, which is both time and memory consuming. We continue with an approximation of the spectral bound that achieves (almost) the same lower-bound quality in significantly less time.

2.3.2.2  Lagrange  Multipliers  Method 

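We relax the integrality conditions on $s_i$ to obtain the following relaxation of the QP, rewritten in vector notation.

\begin{align}
\text{minimize} \quad & \frac{1}{2} s^T \lambda \tag{2-9} \\
\text{subject to} \quad & \|s\|_1 - n = 0, \tag{2-10} \\
& \|s\|_2^2 - \Delta_\beta \le 0, \tag{2-11} \\
& s \ge 0, \tag{2-12}
\end{align}

where $\Delta_\beta = \beta n(n - 1) + n$ and $\|\cdot\|_2$ denotes the Euclidean norm.

The Lagrangian is then

\[ \mathcal{L} = \frac{1}{2} s^T \lambda + x\,(\|s\|_1 - n) + \psi\,(\|s\|_2^2 - \Delta_\beta) - \omega^T s, \]

where $\omega = (\omega_1, \ldots, \omega_n) \ge 0$ is a non-negative multiplier vector.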

Notice that the problem is a convex optimization problem with differentiable objective and constraint functions, and it satisfies Slater's condition with $s = (1, 1, \ldots, 1)^T$ [19]. Hence, the following Karush-Kuhn-Tucker (KKT) conditions provide the necessary and sufficient conditions for optimality:

\begin{align*}
\nabla_s \mathcal{L} = \tfrac{1}{2}\lambda + x\mathbf{1} + 2\psi s - \omega &= 0 \\
\nabla_x \mathcal{L} = \|s\|_1 - n &= 0 \\
\nabla_\psi \mathcal{L} = \|s\|_2^2 - \Delta_\beta &= 0 \\
\omega^T s &= 0 \\
\omega,\ \psi &\ge 0
\end{align*}

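Algorithm 4: LMB($G$, $\beta$)
1:  $t \leftarrow \lceil 2/\beta \rceil$, $\Delta_\beta \leftarrow \lfloor \beta n(n - 1) + n \rfloor$
2:  Compute $\lambda_1, \ldots, \lambda_t$
3:  for $k = 1$ to $n$
4:      if $k > t$ then
5:          $t = \min\{2t, n\}$
6:          Compute $\lambda_1, \ldots, \lambda_t$
7:      Compute $\psi$ as in Eq. 2-20.
8:      Compute $\mathcal{D}_\beta^{(k)}$ and $s_k^{(k)}$ as in Eqs. 2-21 and 2-22
9:      if ($\psi > 0$ and $s_k^{(k)} > 0$) or ($k = n$) then
10:         return $\lceil \mathcal{D}_\beta^{(k)} \rceil$
11: end for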

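Let $k = \max\{i \mid s_i^* > 0\}$. By Lemma 4 and the complementary slackness condition $\omega^T s = 0$, we have $s_i^* > 0$ for $i \le k$, $s_i^* = 0 \;\forall i > k$, and $\omega_j = 0 \;\forall j \le k$.

Denote $s^{(k)} = \{s_1, s_2, \ldots, s_k\}$ and $\lambda^{(k)} = \{\lambda_1, \lambda_2, \ldots, \lambda_k\}$. The KKT conditions can then be simplified to

\begin{align}
\nabla_{s^{(k)}} \mathcal{L} = \tfrac{1}{2}\lambda^{(k)} + x + 2\psi s^{(k)} &= 0, & i \le k \tag{2-13} \\
\nabla_{s_i} \mathcal{L} = \tfrac{1}{2}\lambda_i + x - \omega_i &= 0, & i > k \tag{2-14} \\
\nabla_x \mathcal{L} = \|s^{(k)}\|_1 - n &= 0, \tag{2-15} \\
\nabla_\psi \mathcal{L} = \|s^{(k)}\|_2^2 - \Delta_\beta &= 0, \tag{2-16} \\
s^{(k)} > 0,\ \psi \ge 0,\ \omega^{(k)} &= 0 \tag{2-17}
\end{align}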

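For each value of $k$, we can solve for the values of $s_i$ and check whether all $s_i \ge 0$. The other unknowns can be found as follows. First, substitute the constraint (2-15) into the sum of the constraints (2-13) to obtain $\chi$ in terms of $\psi$:

$\chi = -\frac{\|\lambda^{(k)}\|_1}{2k} - \frac{2n\psi}{k}$   (2-18)

Therefore, we can derive $s^{(k)}$ from (2-13) as

$s^{(k)} = \frac{n}{k} + \left(\frac{\|\lambda^{(k)}\|_1}{4k} - \frac{\lambda^{(k)}}{4}\right)\frac{1}{\psi}$   (2-19)

Substituting the above equation into the condition (2-16) and solving for $\psi$, we have

$\frac{n^2}{k} + \left(\frac{\|\lambda^{(k)}\|_2^2}{16} - \frac{\|\lambda^{(k)}\|_1^2}{16k}\right)\frac{1}{\psi^2} - \Lambda_\beta = 0$

$\psi = \frac{1}{4}\left(\frac{\|\lambda^{(k)}\|_2^2 - \|\lambda^{(k)}\|_1^2/k}{\Lambda_\beta - n^2/k}\right)^{1/2}$   (2-20)

The objective is then

$D^{(k)} = \frac{1}{2}\, s^{(k)T}\lambda^{(k)} = \frac{n\|\lambda^{(k)}\|_1}{2k} + \left(\frac{\|\lambda^{(k)}\|_1^2}{4k} - \frac{\|\lambda^{(k)}\|_2^2}{4}\right)\frac{1}{2\psi}$

$\quad = \frac{n\|\lambda^{(k)}\|_1}{2k} - \frac{1}{2}\sqrt{\left(\Lambda_\beta - \frac{n^2}{k}\right)\left(\|\lambda^{(k)}\|_2^2 - \frac{\|\lambda^{(k)}\|_1^2}{k}\right)}$   (2-21)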


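Since $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$, Eq. 2-19 implies that $s_1^{(k)} \ge s_2^{(k)} \ge \ldots \ge s_k^{(k)}$. Hence, in order to satisfy $s^{(k)} \ge 0$, it is sufficient that the smallest entry is non-negative:

$c^{(k)} = s_k^{(k)} = \frac{n}{k} + \frac{1}{4\psi}\left(\frac{\|\lambda^{(k)}\|_1}{k} - \lambda_k\right) \ge 0$   (2-22)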


Theorem 2.7. The size of a β-edge disruptor is lower-bounded by

$\min_{n \ge k \ge n^2/\Lambda_\beta} \left\{ D^{(k)} \mid c^{(k)} \ge 0 \right\}$

where $D^{(k)}$ and $c^{(k)}$ are given by Eqs. 2-21 and 2-22.

The steps to solve the relaxation of the QP are summarized in Algorithm 4 (LMB Algorithm).
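
As an illustration only, the loop of the LMB algorithm can be sketched in a few lines of Python. The sketch assumes the closed forms for $\psi$, $D^{(k)}$ and $c^{(k)}$ as we read them from Eqs. 2-20 to 2-22, takes the eigenvalues as a precomputed ascending list instead of calling an eigensolver, and treats the degenerate case of identical eigenvalues in a way that is our own assumption. The function name `lmb_lower_bound` is hypothetical.

```python
import math

def lmb_lower_bound(eigs, n, beta):
    """Sketch of the LMB loop (not the authors' implementation).

    eigs: Laplacian eigenvalues in ascending order (assumed precomputed).
    n:    number of nodes; beta: allowed fraction of pairwise connectivity.
    """
    cap = math.floor(beta * n * (n - 1) + n)        # Lambda_beta
    for k in range(1, n + 1):
        lam = eigs[:k]
        l1 = sum(lam)                               # ||lambda^(k)||_1
        l2 = sum(x * x for x in lam)                # ||lambda^(k)||_2^2
        a = max(l2 - l1 * l1 / k, 0.0)              # squared deviation from the mean
        b = cap - n * n / k                         # negative means k < n^2 / Lambda_beta
        if b < 0:
            continue
        d_k = n * l1 / (2 * k) - 0.5 * math.sqrt(a * b)          # Eq. 2-21
        # Eq. 2-22: smallest entry s_k; note 1/(4*psi) = sqrt(b/a).
        # Assumption: when all eigenvalues coincide, s is uniform (n/k).
        c_k = n / k if a == 0 else n / k + (l1 / k - lam[-1]) * math.sqrt(b / a)
        if (d_k > 0 and c_k >= 0) or k == n:
            return max(0, math.ceil(d_k))
    return 0

# 4-cycle: Laplacian spectrum {0, 2, 2, 4}; at least 1 link for beta = 0.5.
print(lmb_lower_bound([0.0, 2.0, 2.0, 4.0], 4, 0.5))
```

For the 4-cycle the true minimum is two links, so the value returned here is a valid (if loose) lower bound. Floating-point rounding can push a value that is mathematically an integer slightly above it before the ceiling is taken, so a production version would need a tolerance.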

Time complexity. The LMB algorithm spends most of its time computing the eigenvalues. This can be done with the Implicitly Restarted Lanczos Method, which has worst-case time complexity $O(mKh + nK^2h + K^3h)$, where $K$ is the number of eigenvalues to be computed and $h$ is the number of iterations for the eigenvalue algorithm to converge [80]. Given the eigenvalues, the rest of LMB takes only $O(n)$ time in the worst case.

The number of required eigenvalues $K$ is small in our algorithm. At the beginning, the algorithm computes the $t = \lceil 2/\beta \rceil$ smallest eigenvalues, and the number of computed eigenvalues is doubled each time if necessary. In our experiments, the number of needed eigenvalues is $2/\beta$ in most cases. For example, to bound the number of links whose removal disrupts 90% of the pairwise connectivity, we only need to compute about 20 of the smallest eigenvalues of the Laplacian matrix. In our experiments the LMB algorithm scaled well, with running time growing linearly in the number of nodes and edges.

2.3.2.3  Time  and  quality  trade-off 

On one hand, the ILB algorithm (Algorithm 3) provides a better bound than the LMB algorithm. The reason is that ILB solves the QP exactly, while LMB only targets a relaxation of the QP. However, the difference between the outputs of the two algorithms is negligibly small: either zero or one in our experiments (see footnote 2).

Figure 2-4. Minimum cost and lower bounds for β-disruptor on the synthetic networks

[Panel titles: A. Erdős-Rényi (random) network; B. Barabási (power-law) network; C. Watts-Strogatz network]

Figure 2-5. Running time on the synthetic networks

On the other hand, LMB has a much more practical time complexity. ILB has a high time complexity of $O(n^4)$ and can only be applied to networks of up to a few thousand nodes. In contrast, LMB takes only linear time to compute its competitive bound. Overall, for small and medium networks, one can apply the ILB algorithm (or other mathematical approaches [30]) to compute the lower bound; for large networks, however, LMB remains the only choice.

2.3.3 Experimental Results

We  compute  our  spectral  lower-bound  for  both  synthetic  and  real-world  networks 
and  compare  the  results  with  the  optimal  results  whenever  possible. 

2.3.3.1  Synthetic  Networks 

We generate the synthetic networks following well-known complex network models. All networks have 100 nodes and around 300 edges. The details of those networks are as follows.

•  Erdős-Rényi: A random graph of 100 vertices and 300 edges following the Erdős-Rényi model [36].

•  Barabási-Albert: A power-law model using the preferential attachment mechanism [12].

•  Small world: A random graph following the Watts and Strogatz model [79]. The dimension of the lattice is set to 3 and the rewiring probability is 0.3.

The optimal solutions are found with integer programming using the sparse metric technique in [30]. The technique in [30] is also applied to compute the lower bound given by solving the linear program. The results produced by the ILB and LMB algorithms are identical (after rounding up) and are plotted under the same name "spectral bound". All algorithms were run on a PC with an Intel Xeon 2.93 GHz processor and 12 GB of memory. The integer program (IP) and the linear program (LP) are solved with the mathematical optimization package GUROBI 4.5.

The minimum numbers of links whose removal causes a certain level of disruption are shown in Fig. 2-4. For all three networks, solving the LP gives a good lower bound on the minimum number of links to remove. The spectral bounds are much worse than the LP bounds in the random and small-world networks; however, the spectral bound closely approaches the LP bound and the optimal solution when the network has the power-law topology of the Barabási model.

2  Both algorithms round up their results to the nearest integers.

As shown in Fig. 2-5, there is a big gap between the running time of the spectral bound and those of LP and IP. Note that all the spectral bounds are computed at once, i.e., the reported running time is the total running time over all values of β. Even so, the spectral bound is still thousands of times faster than LP and IP.

Overall, while IP is best used for small networks and LP can be used for medium networks of a few thousand nodes, the only feasible method to compute the lower bound in large networks is the spectral bound. One attractive aspect of the LMB spectral bound, described in Algorithm 4, is that it can easily be implemented in a distributed manner. The most time-consuming part of the algorithm is computing the few smallest eigenvalues, which can be done in a distributed fashion with existing mathematical software [16].

Table 2-1. Sizes of the investigated networks and the corresponding running time to compute the lower bound

              CAIDA AS    Oregon AS    P2P Gnutella
Vertices      8,020       11,174       22,663
Edges         36,406      23,410       109,386
Time (s)      1530.1      321.0        207.9

2.3.3.2  Real-world  Datasets 

The spectral lower bounds that we compute for real networks are shown in Fig. 2-6. Neither LP nor IP can run on these networks due to both time and memory limits. The studied networks are

•  Gnutella P2P: Gnutella peer-to-peer network from Aug. 25, 2002 [56]. Nodes represent hosts in the network and edges are the connections between the Gnutella hosts.

•  Oregon AS: AS peering information inferred from Oregon route-views between Mar. 31 and May 26, 2001 [56].

•  CAIDA AS: The CAIDA AS Relationships Datasets, from September 17, 2007 [56].

The lower bounds in Fig. 2-6 indicate that it is difficult to destroy most of the connectivity in communication networks. For example, even after removing 369 links, at least 50% of the node pairs in the CAIDA AS network stay connected; and to bring the connectivity level in the Gnutella P2P network down to 15%, one has to destroy at least 960 links. Due to its low edge density, the Oregon AS network tends to be more vulnerable than the other two networks. Nevertheless, disrupting the connectivity in that network down to the 5% level would require removing more than 763 links.


Figure 2-6. Lower bounds on the number of link attacks for real networks found with the LMB algorithm. (Axis label: remaining pairwise connectivity β, in percent.)

CHAPTER 3

MULTIPLE  NODE  ATTACKS 

3.1  Bicriteria Approximation Algorithm for β-vertex Disruptor

We present a polynomial time algorithm (Algorithm 5) that finds a β'-vertex disruptor in the directed graph $G(V, E)$ whose size is at most $O(\log n \log\log n)$ times that of the optimal β-vertex disruptor, where $0 < \beta < \beta'^2$. The algorithm has two phases. In the first phase, we split each vertex $v \in V$ into two vertices $v^+$ and $v^-$, add an edge from $v^-$ to $v^+$, and show that removing $v$ in $G$ has the same effect as removing the edge $(v^- \to v^+)$ in the new graph. In the second phase, we decompose the new graph into strongly connected components (SCCs), capping the size of the largest component while minimizing the number of removed edges. We relax the constraint on the size of each component until the set of cut edges induces a β'-vertex disruptor in the original graph $G$.

Given a directed graph $G(V, E)$ for which we want to find a small β'-vertex disruptor, we split each vertex in $G$ into two new vertices to obtain a new directed graph $G'(V', E')$ where

$V' = \{v^-, v^+ \mid v \in V\}$

$E' = \{(v^- \to v^+) \mid v \in V\} \cup \{(u^+ \to v^-) \mid (u \to v) \in E\}$

The new graph $G'(V', E')$ has twice the number of vertices of $G$, i.e. $|V'| = 2|V| = 2n$. An example of the first phase is shown in Figure 3-1.

We set the costs of all edges in $E'_V = \{(v^- \to v^+) \mid v \in V\}$ to 1 and the costs of the other edges in $E'$ to $+\infty$, so that only edges in $E'_V$ can be selected in an edge disruptor set. In an implementation, it is safe to set the costs of edges not in $E'_V$ to $O(n)$, noting that by paying a cost of $2n$ we can effectively disconnect all edges in $E'_V$.
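
The first phase is simple enough to sketch directly. The following Python fragment is a minimal illustration, not the implementation used in our experiments: the function name `split_vertices`, the tuple encoding of $v^-$ and $v^+$, and the finite stand-in cost `2n + 1` for $+\infty$ are all assumptions made for the example.

```python
def split_vertices(vertices, edges):
    """Build the split graph G' from a directed graph G (illustrative sketch).

    Each node v becomes (v, '-') and (v, '+'). The returned dict maps every
    directed edge of G' to its removal cost.
    """
    n = len(vertices)
    big = 2 * n + 1                       # finite stand-in for +infinity (assumption)
    v_prime = [(v, sign) for v in vertices for sign in ('-', '+')]
    # Edges in E'_V: one unit-cost edge v- -> v+ per original vertex.
    cost = {((v, '-'), (v, '+')): 1 for v in vertices}
    # Every original edge u -> v becomes u+ -> v- and must never be cut.
    for u, v in edges:
        cost[((u, '+'), (v, '-'))] = big
    return v_prime, cost

# Directed triangle 1 -> 2 -> 3 -> 1.
nodes, cost = split_vertices([1, 2, 3], [(1, 2), (2, 3), (3, 1)])
print(len(nodes), len(cost))
```

Cutting the unit-cost edge `((v, '-'), (v, '+'))` in the split graph corresponds to removing node `v` in the original graph.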

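Consider a directed edge disruptor set $D'_e \subset E'$ that contains only edges in $E'_V$. There is a one-to-one correspondence between $D'_e$ and the set $D_v = \{v \mid (v^- \to v^+) \in D'_e\}$ in $G(V, E)$, which is a vertex disruptor set in $G$. However, since $G$ and $G'$ have different maximum pairwise connectivity, $\frac{(n-1)n}{2}$ for $G$ and $\frac{(2n-1)2n}{2}$ for $G'$, the fractions of pairwise connectivity remaining in $G$ and $G'$ after removing $D_v$ and $D'_e$ are not exactly equal to each other.

Algorithm 5. β'-vertex disruptor

Input: Directed graph $G = (V, E)$ and fixed $0 < \beta' < 1$.
Output: β'-vertex disruptor of $G$

  $G'(V', E') \leftarrow (\emptyset, \emptyset)$
  $\forall v \in V$: $V' \leftarrow V' \cup \{v^+, v^-\}$
  $\forall v \in V$: $E' \leftarrow E' \cup \{(v^- \to v^+)\}$, $c(v^-, v^+) \leftarrow 1$
  $\forall (u \to v) \in E$: $E' \leftarrow E' \cup \{(u^+ \to v^-)\}$, $c(u^+, v^-) \leftarrow \infty$
  $\underline{\beta} \leftarrow 0$, $\overline{\beta} \leftarrow 1$
  $D_v \leftarrow V(G)$
  while ($\overline{\beta} - \underline{\beta} > \epsilon$) do
    $\tilde{\beta} \leftarrow \lfloor (\underline{\beta} + \overline{\beta}) / (2\epsilon) \rfloor \times \epsilon$
    Find $D'_e \subset E'$ to separate $G'$ into strongly connected components of sizes at most $\tilde{\beta}|V'|$ using the algorithm in [37]
    $\tilde{D}_v \leftarrow \{v \in V(G) \mid (v^- \to v^+) \in D'_e\}$
    if $\mathcal{P}(G[V \setminus \tilde{D}_v]) \le \beta'\binom{n}{2}$ then
      $\underline{\beta} = \tilde{\beta}$
      Remove nodes from $\tilde{D}_v$ as long as $\mathcal{P}(G[V \setminus \tilde{D}_v]) \le \beta'\binom{n}{2}$
      if $|D_v| > |\tilde{D}_v|$ then $D_v = \tilde{D}_v$
    else $\overline{\beta} = \tilde{\beta}$
  end while
  Return $D_v$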

In the second phase of Algorithm 5, when separating a graph into SCCs, the smaller the sizes of the SCCs, the smaller the pairwise connectivity in the graph. However, the smaller the maximum size of each SCC, the more edges have to be cut. We perform a binary search to find the right upper bound for the size of each SCC in $G'$. In the algorithm, the lower bound and upper bound of the size of each SCC are denoted by $\underline{\beta}|V'|$ and $\overline{\beta}|V'|$ respectively. At each step we try to find a minimum capacity edge set in $G'(V', E')$ whose removal partitions the graph into strongly connected components of size at most $\tilde{\beta}|V'|$, where $\tilde{\beta} = \lfloor (\underline{\beta} + \overline{\beta}) / (2\epsilon) \rfloor \times \epsilon$. We round the value of $\tilde{\beta}$ to the nearest multiple of $\epsilon$ so that the number of steps of the binary search is bounded by $\log \frac{1}{\epsilon}$. The problem of finding a minimum capacity edge set to decompose a graph of size $n$ into strongly connected components of size at most $\rho n$ is known as the ρ-separator problem. We use here the algorithm presented in [37] that, for a fixed $\epsilon > 0$, finds a ρ-separator in a directed graph $G$ whose value is at most $O(\frac{1}{\epsilon^2} \log n \log\log n)$ times $Opt_{(\rho-\epsilon)\text{-separator}}$, where $Opt_{(\rho-\epsilon)\text{-separator}}$ is the cost of the optimal $(\rho - \epsilon)$-separator. Finally, we derive the cut vertices in $G$ from the cut edges in $G'$ to obtain the β'-vertex disruptor.

Lemma 5. Algorithm 5 always terminates with a β'-vertex disruptor.

Proof. We show that whenever $\tilde{\beta} \le \beta'$, the corresponding $D_v$ found in Algorithm 5 is a β'-vertex disruptor in $G$. Consider the edge disruptor $D'_e$ in $G'$ induced by $D_v$. We first show the mapping between SCCs in $G[V \setminus D_v]$ and SCCs in $G'[E' \setminus D'_e]$, the graph obtained by removing $D'_e$ from $G'$. Partition the vertex set $V$ of $G$ into: (1) $D_v$: the set of removed nodes; (2) $V_{single}$: the set of nodes that are not in any cycle, i.e. they are SCCs of size one; (3) $V_{connected}$: the union of the remaining SCCs whose sizes are at least two, say $V_{connected} = \bigcup_i C_i$, $|C_i| \ge 2$. Vertices in $V_{connected}$ belong to at least one cycle in $G$.

We have the following corresponding SCCs in $G'[E' \setminus D'_e]$:

(1) $v \in D_v \leftrightarrow$ SCCs $\{v^+\}$ and $\{v^-\}$, since after removing $(v^- \to v^+)$, $v^+$ does not have incoming edges and $v^-$ does not have outgoing edges.

(2) $v \in V_{single} \leftrightarrow$ SCCs $\{v^+\}$ and $\{v^-\}$, since $v$ does not lie on any cycle in $G$. Assume $v^+$ belongs to some SCC of size at least 2, i.e. $v^+$ lies on some cycle in $G'$. Because the only incoming edge to $v^+$ is from $v^-$, it follows that $v^-$ precedes $v^+$ on that cycle. Let $u^-, u^+$ be the vertices succeeding $v^+$ on that cycle. Then $u$ and $v$ belong to the same SCC in $G$, which yields a contradiction. Similarly, $v^-$ cannot lie on any cycle in $G'$.

(3) SCC $C_i \subset V_{connected} \leftrightarrow$ SCC $C'_i = \{v^-, v^+ \mid v \in C_i\}$. This can be shown using an argument similar to that in the case $v \in V_{single}$.

Since $D'_e$ is a $\tilde{\beta}$-separator, the sizes of the SCCs in $G'[E' \setminus D'_e]$ are at most $\tilde{\beta}\, 2n$. It follows that the sizes of the SCCs in $G[V \setminus D_v]$ are bounded by $\tilde{\beta} n$. Denote the set of SCCs in $G[V \setminus D_v]$ by $\mathcal{C}$, with the convention that vertices in $D_v$ become singleton SCCs in $G[V \setminus D_v]$.
Therefore,  we  have: 


This guarantees that the binary search always finds a β'-vertex disruptor and completes the proof. □


Theorem 3.1. Algorithm 5 always finds a β'-vertex disruptor whose size is at most $O(\log n \log\log n)$ times that of the optimal β-vertex disruptor for $\beta'^2 > \beta > 0$.

Proof. It follows from Lemma 5 that Algorithm 5 terminates with a β'-vertex disruptor $D_v$. At some step the capacity of $D_v$ equals the capacity of a $\tilde{\beta}$-separator $D'_e$ in $G'$, where $\tilde{\beta}$ is at least $\beta' - \epsilon$ according to Lemma 5 and the binary search scheme. The cost of the separator is at most $O(\log n \log\log n)$ times $Opt_{(\tilde{\beta}-\epsilon)\text{-separator}}$ using the algorithm in [37].

Consider  an  optimal  (/3'2  -  9e)-vertex  disruptor  D'v  of  G  and  its  corresponding  edge 
disruptor  D'e  in  G'.  Denote  the  cost  of  that  optimal  vertex  disruptor  by  Opt(^2_9c).vD- 
there  exists  in  G[V\Dv]  a  SCC  Ci  so  that  \Ci\  >  ((3'  -  2 e)n  then  V{GyV\Dvfi  >  |((/3'  - 
2 e)n  -  2)((/3'  -  2 e)n  -  1)  >  (/3/2  -  9e)(2)  when  n  >  20(/^+1).  Hence,  every  SCC  in  G'[V^DI] 
have  size  at  most  (/3'  -  2e)(2n)  i.e.  D'e  is  an  (/?'  -  2e)-separator  in  G'.  It  follows  that 
$Opt_{(\beta'^2-9\epsilon)\text{-VD}} \ge Opt_{(\beta'-2\epsilon)\text{-separator}}$ in $G'$.

Figure 3-1. Conversion from the node version in a directed graph (a) into the edge version in a directed graph (b)

Since $\tilde{\beta} - \epsilon \ge \beta' - 2\epsilon$, we have $Opt_{(\tilde{\beta}-\epsilon)\text{-separator}} \le Opt_{(\beta'-2\epsilon)\text{-separator}} \le Opt_{(\beta'^2-9\epsilon)\text{-VD}}$.

The size of the vertex disruptor $|D_v| = |D'_e|$ is at most $O(\log n \log\log n)$ times $Opt_{(\tilde{\beta}-\epsilon)\text{-separator}}$. Thus, the size of the found β'-vertex disruptor $D_v$ is at most $O(\log n \log\log n)$ times that of the optimal $(\beta'^2 - 9\epsilon)$-vertex disruptor. As we can choose an arbitrarily small $\epsilon$, setting $\beta = \beta'^2 - 9\epsilon$ completes the proof. □

Time complexity. Finding the separator costs $O(n^9)$ [37]. Hence, the total time complexity is $O(n^9 \log \frac{1}{\epsilon})$. However, in our experiments, the algorithm takes much less than its worst-case running time.

3.2  Connection  between  Edge  Disruptor  and  Vertex  Disruptor 

We  show  that  an  approximation  algorithm  for  general  directed  edge  disruptor 
yields  an  approximation  algorithm  for  directed  vertex  disruptor  with  (almost)  the  same 
approximation  ratio. 

Lemma 6. A β-edge disruptor set in the directed graph $G'$ induces a β-vertex disruptor set of the same cost in $G$.

Proof. We use $D_v$ and $D'_e$ for the vertex disruptor in $G$ and the edge disruptor in $G'$. Given $\mathcal{P}(G'[E' \setminus D'_e]) \le \beta\binom{2n}{2}$, we need to prove that $\mathcal{P}(G[V \setminus D_v]) \le \beta\binom{n}{2}$, where $n = |V|$.

Assume $G[V \setminus D_v]$ has $l$ SCCs of size at least 2, say $C_i$, $i = 1 \ldots l$. The corresponding SCCs in $G'[E' \setminus D'_e]$ are $C'_i$, $i = 1 \ldots l$, where $|C'_i| = 2|C_i|$.

Since

$\frac{\binom{2k}{2}}{\binom{2n}{2}} - \frac{\binom{k}{2}}{\binom{n}{2}} = \frac{k(n-k)}{(n-1)n(2n-1)} \ge 0$ for all $0 < k \le n$,

we have

$\frac{\mathcal{P}(G[V \setminus D_v])}{\binom{n}{2}} \le \frac{\mathcal{P}(G'[E' \setminus D'_e])}{\binom{2n}{2}} \le \beta$

□

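Lemma 7. A β-vertex disruptor set in $G$ induces a $(\beta + \epsilon)$-edge disruptor set of the same cost in $G'$ for any $\epsilon > 0$.

Proof. We use the same notation as in the proof of Lemma 6. Given $\mathcal{P}(G[V \setminus D_v]) \le \beta\binom{n}{2}$, we need to prove that $\mathcal{P}(G'[E' \setminus D'_e]) \le (\beta + \epsilon)\binom{2n}{2}$. We have:

$\frac{\mathcal{P}(G'[E' \setminus D'_e])}{\binom{2n}{2}} = \sum_{i=1}^{l} \frac{|C_i|(n - |C_i|)}{(n-1)n(2n-1)} + \frac{\mathcal{P}(G[V \setminus D_v])}{\binom{n}{2}}$

$\quad = \frac{\mathcal{P}(G[V \setminus D_v])}{\binom{n}{2}}\left(1 - \frac{1}{2n-1}\right) + \frac{\sum_{i=1}^{l} |C_i|}{n(2n-1)}$

$\quad < \beta + \frac{1}{2n-1} \le \beta + \epsilon$   (3-1)

for sufficiently large $n$ (it suffices that $2n - 1 \ge 1/\epsilon$). □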

Theorem 3.2. Given a factor $f(n)$ polynomial time approximation algorithm for β-edge disruptor, there exists a factor $(1 + \epsilon)f(n)$ polynomial time approximation algorithm for β-vertex disruptor, where $\epsilon > 0$ is an arbitrarily small constant.

Proof. Let $G$ be a directed graph with uniform vertex costs in which we wish to find a β-vertex disruptor. Construct $G'$ as described in Section 3.1.

Apply the given approximation algorithm to find in $G'$ a β-edge disruptor, denoted by $D'_e$, with cost at most $f(n) \cdot Opt_{\beta\text{-ED}}(G')$, where $Opt_{\beta\text{-ED}}(G')$ is the cost of a minimum β-edge disruptor in $G'$. From Lemma 6, $D'_e$ induces in $G$ a β-vertex disruptor $D_v$ of the same cost. We shall prove that

$Opt_{\beta\text{-ED}}(G') \le Opt_{\beta\text{-VD}}(G) + \gamma_0$,

where $Opt_{\beta\text{-VD}}(G)$ is the cost of a minimum β-vertex disruptor in $G$ and $\gamma_0$ is some positive constant. It follows that the cost of $D_v$ will be at most

$f(n) \cdot (Opt_{\beta\text{-VD}}(G) + \gamma_0) < (1 + \epsilon) f(n)\, Opt_{\beta\text{-VD}}(G)$


Here,  we  assume  that  Opt/3_VD(G)  >  f  otherwise  we  can  find  Opt/3_VD(G)  in  time 
0(n?+2). 


From  an  optimal  /3-vertex  disruptor  of  G,  construct  its  corresponding  edge  disruptor 

D*e  in  G'.  If  V(G'[E  \  D*e]  <  /3(22n)  then  Opt/3_ED(G')  <  Opt(3_VD(G)  and  we  yield  the 
proof.  Thus,  we  consider  the  case  V(G'[E  \  D*e\  >  /3(22n). 

Among  SCCs  of  G'[E  \  D*\,  there  must  be  a  SCC  of  size  at  least  /32n  or  else 


vertices  from  that  SCC. 


G'[E  \  D*e }  <  (d_1(/32n)  <  /3(22n)  (contradiction).  Remove  70  = 

The  pairwise  connectivity  in  G'[E  \  D *]  will  decrease  at  least  (/32n  -  T)i  =  2n  -  -k  >  n 
for  sufficient  large  n.  From  Eq.  3-1  in  Lemma  7,  the  pairwise  connectivity  after  removing 
vertices  will  be  less  than 


(/3  + 


1 

2n  —  1 


Therefore,  after  removing  at  most  70  vertices  from  D*e,  we  get  a  /3-edge  disruptor. 
Hence, 

<  Opf/3— VD (f-O  +  70-  D 


3.3  Branch-and-cut  Algorithm 

Branch-and-cut  methods  have  proven  to  be  a  very  successful  approach  for  solving  a 
wide  variety  of  integer  programming  problems.  In  contrast  with  meta-heuristics,  they  can 
guarantee  optimality.  They  combine  a  branch-and-bound  algorithm  with  a  cutting  plane 
method  that  is  used  to  improve  the  solution  of  the  linear  programming  relaxations. 

This section presents the components of our branch-and-cut algorithm. We begin with a new lightweight mixed integer programming formulation for β-vertex disruptor in Subsection 3.3.2. In the next subsection, we introduce a new class of strong cutting planes and the separation procedure to find such cutting planes. The primal heuristic that provides upper bounds for pruning during the search process is presented in Subsection 3.3.4.

3.3.1  Mixed  Integer  Programming  Formulation 

We model the network as an undirected graph $G = (V, E)$ of $n$ nodes numbered from 1 to $n$; the degree of node $1 \le i \le n$ is denoted by $d(i)$. The pairwise connectivity of $G$, denoted by $\mathcal{P}(G)$, is the number of node pairs with at least one path between them. For example, if $G$ is connected, then $\mathcal{P}(G) = \binom{n}{2}$.

Given a positive constant $0 < \beta < 1$, a subset of vertices $S \subset V$ in $G$ is a β-vertex disruptor if the subgraph $G[V \setminus S]$, induced by $V \setminus S$ in $G$, has pairwise connectivity at most $\beta\binom{n}{2}$. The β-vertex disruptor problem asks for a β-vertex disruptor of minimum size.
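
To make the two definitions concrete, the following Python sketch computes the pairwise connectivity of a residual graph and tests the β-vertex disruptor condition. It is a minimal illustration under our own conventions: the function names `pairwise_connectivity` and `is_vertex_disruptor` are hypothetical, nodes are numbered 1 to `n`, and edges are given as pairs.

```python
from math import comb

def pairwise_connectivity(n, edges, removed=()):
    """Number of connected node pairs in G after deleting the nodes in `removed`."""
    removed = set(removed)
    adj = {v: [] for v in range(1, n + 1) if v not in removed}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].append(v)
            adj[v].append(u)
    seen, total = set(), 0
    for start in adj:
        if start in seen:
            continue
        # Depth-first search to measure the size of this component.
        seen.add(start)
        stack, size = [start], 0
        while stack:
            x = stack.pop()
            size += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        total += comb(size, 2)          # a component of size c contributes C(c, 2) pairs
    return total

def is_vertex_disruptor(n, edges, removed, beta):
    """True if `removed` leaves at most beta * C(n, 2) connected pairs."""
    return pairwise_connectivity(n, edges, removed) <= beta * comb(n, 2)

path = [(1, 2), (2, 3), (3, 4), (4, 5)]
print(pairwise_connectivity(5, path), pairwise_connectivity(5, path, {3}))
```

On the 5-node path, removing the middle node leaves two components of size two, so only 2 of the original 10 pairs stay connected.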

The problem can be generalized so that each node $u \in V$ has a removal cost $w(u)$ and we wish to find a β-vertex disruptor of minimum cost. This generalization is straightforward and is omitted to simplify the presentation.

The IP formulation for the β-vertex disruptor problem (IPvd) is as follows:

minimize     $\sum_{i=1}^{n} s_i$   (3-2)

subject to   $d_{ij} \le s_i + s_j$,   $(i,j) \in E$,   (3-3)

             $d_{ij} + d_{jk} \ge d_{ik}$,   $\forall i \ne j \ne k$,   (3-4)

             $\sum_{i<j} d_{ij} \ge (1 - \beta)\binom{n}{2}$,   (3-5)

             $s_i \le d_{ij}$,   $i \ne j$,   (3-6)

             $s_i, d_{ij} \in \{0, 1\}$,   $i, j \in [1..n]$   (3-7)

We use the variable $d_{ij}$ to represent the "distance" between a pair of nodes $i$ and $j$ in the residual network, i.e.

$d_{ij} = 0$ if $i$ and $j$ are in the same connected component, and $d_{ij} = 1$ otherwise.

An extra variable $s_i$ is used for each node $i \in V$, where

$s_i = 0$ if node $i$ is not removed, and $s_i = 1$ if $i$ is removed (selected into the disruptor).

The objective minimizes the total number of removed nodes, i.e. the size of the vertex disruptor. Note that $d_{ij} = d_{ji}\ \forall (i,j) \in V \times V$. Constraint (3-4) is the well-known triangle inequality, which implies that if $i$ and $j$ are connected, and $j$ and $k$ are connected, then $i$ and $k$ must be connected. Constraint (3-5) limits the pairwise connectivity in $G$ to be at most $\beta\binom{n}{2}$.

Constraint (3-3) implies the base case: if $i$ and $j$ are neighbors and neither $i$ nor $j$ is removed ($s_i = s_j = 0$), then $i$ and $j$ remain connected, i.e. $d_{ij} = 0$. Constraint (3-6) states the fact that a removed node is not connected to any other node ($s_i = 1 \to d_{ij} = 1$).
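
As a quick check of how the variables and constraints fit together, the sketch below builds the 0/1 assignment induced by a removal set and verifies constraints (3-3) to (3-6) by brute force. It is an illustration only, practical for very small graphs; the function names `induced_solution` and `satisfies_ip` and the dictionary encoding of $s$ and $d$ are assumptions of the example, and the constraint numbers follow our reading of the formulation above.

```python
from itertools import combinations, permutations
from math import comb

def induced_solution(n, edges, removed):
    """s_i = 1 iff i is removed; d_ij = 0 iff i, j share a residual component."""
    removed = set(removed)
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        if u not in removed and v not in removed:
            adj[u].add(v)
            adj[v].add(u)
    label = {}
    for v in range(1, n + 1):           # label components; removed nodes stay isolated
        if v in label:
            continue
        label[v] = v
        stack = [v]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in label:
                    label[y] = v
                    stack.append(y)
    s = {i: int(i in removed) for i in range(1, n + 1)}
    d = {(i, j): int(i != j and label[i] != label[j])
         for i in range(1, n + 1) for j in range(1, n + 1)}
    return s, d

def satisfies_ip(n, edges, s, d, beta):
    nodes = range(1, n + 1)
    ok_edge = all(d[i, j] <= s[i] + s[j] for i, j in edges)               # (3-3)
    ok_tri = all(d[i, j] + d[j, k] >= d[i, k]
                 for i, j, k in permutations(nodes, 3))                   # (3-4)
    total = sum(d[i, j] for i, j in combinations(nodes, 2))
    ok_sum = total >= (1 - beta) * comb(n, 2)                             # (3-5)
    ok_rem = all(s[i] <= d[i, j] for i in nodes for j in nodes if i != j) # (3-6)
    return ok_edge and ok_tri and ok_sum and ok_rem

path = [(1, 2), (2, 3), (3, 4), (4, 5)]
s, d = induced_solution(5, path, {3})
print(satisfies_ip(5, path, s, d, 0.25))
```

Removing the middle node of the 5-node path disconnects 8 of the 10 pairs, which meets the requirement of at least $(1 - 0.25) \cdot 10 = 7.5$ in constraint (3-5).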

There are several drawbacks with the IP formulation of the β-vertex disruptor problem (IPvd) (and also with the formulations of k-CND and k-CED [10]). The large number of integral variables, $O(n^2)$, makes the selection of branching variables difficult and significantly increases the depth and size of the search tree. In addition, the excessive number of constraints, $O(n^3)$, leads even for small instances to a large linear programming relaxation that consumes an extremely large amount of memory and computing time.

3.3.2  Sparse  Metric  Technique 

We first devise a new Mixed-Integer Programming (MIP) formulation for the β-vertex disruptor problem that consists of only $n$ integer variables and a much smaller number of constraints. Since the only role of the triangle inequalities is to guarantee that $d_{ij}$ is a pseudo-metric (as defined later in the proof of Theorem 3.3), we introduce a compact subset of inequalities, the so-called sparse metric, that guarantees the same pseudo-metric property. When the network is sparse, i.e. $|E| \propto |V|$, the number of constraints reduces substantially from $\Theta(n^3)$ to $\Theta(n^2)$.

Our new MIP formulation for the β-vertex disruptor problem (MIPvd) is similar to IPvd except that constraints (3-4) and (3-7) are replaced by the following:

$d_{ik} + d_{kj} \ge d_{ij}$,   $k \in N_{\min}(i,j)$   (3-8)

$d_{ij} \in [0, 1]$,   $i, j \in [1..n]$,   (3-9)

where $N_{\min}(i,j)$ is the set of neighbors of $i$ excluding $j$ if $d(i) \le d(j)$, and the set of neighbors of $j$ excluding $i$ otherwise. We also drop the integrality requirements on $d_{ij}$, i.e. we replace the constraints $d_{ij} \in \{0, 1\}$ with $d_{ij} \in [0, 1]$. However, the integrality of $s_1, s_2, \ldots, s_n$ remains.

Note that there are exactly $n$ integer variables $s_1, s_2, \ldots, s_n$. In addition, the number of constraints is upper bounded by

$|E| + \sum_{i<j} \min\{d(i), d(j)\} + \frac{n(n-1)}{2} < n^2 + \sum_{i<j} \frac{d(i) + d(j)}{2} = n^2 + \frac{n-1}{2}\sum_{i=1}^{n} d(i) = O(mn)$

Hence, the number of constraints is substantially less than $O(n^3)$ for complex networks, which are often sparse.
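
The size of the reduction is easy to see numerically. The sketch below evaluates the upper-bound expression $|E| + \sum_{i<j} \min\{d(i), d(j)\} + n(n-1)/2$ for a given graph; the function name `sparse_metric_count` is hypothetical, and the value is the bound as we read it from the text rather than an exact count of the constraints an implementation would generate.

```python
from itertools import combinations

def sparse_metric_count(n, edges):
    """Upper bound on the number of MIPvd constraints for a graph on nodes 1..n."""
    deg = {v: 0 for v in range(1, n + 1)}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    m = len(edges)
    # One sparse-metric inequality per pair (i, j) and per node in N_min(i, j).
    sparse = sum(min(deg[i], deg[j]) for i, j in combinations(deg, 2))
    return m + sparse + n * (n - 1) // 2

path = [(1, 2), (2, 3), (3, 4), (4, 5)]
print(sparse_metric_count(5, path))
```

For the 5-node path this gives 27, compared with the bound $n^2 + (n-1)m = 41$ from the derivation above.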

We proceed to prove the equivalence of the compact formulation MIPvd to IPvd by showing the following:

•  The integrality constraints on $d_{ij}\ \forall i,j$ are in fact redundant (Proposition 3.1).

•  The optimal solutions of MIPvd also induce optimal β-vertex disruptors in $G$ (Theorem 3.3).

•  The optimal fractional solution of the LP relaxation of IPvd can be found by solving the (smaller) LP relaxation of MIPvd, followed by an $O(mn + n^2 \log n)$ tuning procedure (Theorem 3.4).

Proposition 3.1. For every optimal solution of MIPvd, there is a feasible solution of the MIP with the same objective value in which all variables are integral.

Proof. Round all $d_{ij} > 0$ to 1. This will not violate constraints (3-6) and (3-5). For constraints (3-3), if $d_{ij}$ is rounded up to 1 then the integrality of $s_i, s_j$ implies $s_i + s_j \ge 1$; otherwise $d_{ij} = 0$ and the constraints are still satisfied. Assume the rounding violates constraints (3-8) for some triple $(i, j, k)$. This happens if and only if $d_{ij} = 1$ and $d_{ik} = d_{kj} = 0$ after rounding. Hence, before rounding, $d_{ij} > 0$ and $d_{ik} = d_{kj} = 0$, which contradicts the constraint $d_{ik} + d_{kj} \ge d_{ij}$. It follows that rounding gives a feasible integral solution to the MIP. □

Let $D_{MIP} = \{i \mid s_i = 1\}$ be the disruptor induced by the optimal solution of MIPvd and $OPT_{VD}$ be an optimal β-vertex disruptor.

By setting $s_i = 1$ for all $i \in OPT_{VD}$ and $s_i = 0$ otherwise, $d_{ij} = 0$ for all $i, j$ in the same connected component of $G[V \setminus OPT_{VD}]$, and $d_{ij} = 1$ if not, we obtain a feasible solution for MIPvd. Therefore,

$|D_{MIP}| \le |OPT_{VD}|$

Theorem 3.3. The optimal solution $D_{MIP} = \{i \mid s_i = 1\}$ obtained by solving MIPvd is a minimum β-vertex disruptor of $G$.

Proof. Since $|D_{MIP}| \le |OPT_{VD}|$, we only need to show that $D_{MIP}$ is a β-vertex disruptor.

Assume that we can prove that $d_{ij} = 0$ for every connected pair $(i, j)$ in $G[V \setminus D_{MIP}]$. Then only disconnected pairs contribute to the sum in constraint (3-5). Since $d_{ij} \le 1\ \forall i, j \in [1..n]$, the number of disconnected pairs must be at least $(1 - \beta)\binom{n}{2}$. It follows that $D_{MIP}$ is a β-vertex disruptor.

Hence, the rest of the proof is to show that $d_{ij} = 0$ for every connected pair $(i, j)$ in $G[V \setminus D_{MIP}]$.

Note that $d$ is a pseudo-metric, i.e., the function $d(i,j) = d_{ij}$ satisfies:

•  Non-negativity: $d(i,j) \ge 0$

•  Identity: $d(i,i) = 0$

•  Symmetry: $d(i,j) = d(j,i)$

•  Subadditivity: $d(i,j) \le d(i,k) + d(k,j)$.

For each connected pair $(i, j)$ in $G[V \setminus D_{MIP}]$, we prove that $d_{ij} = 0$ by induction on the length $t$ of the shortest path (in number of hops) between nodes $i$ and $j$.

The basis. The statement holds for $t = 1$. By constraint (3-3), if $(i, j) \in E$ and $i, j$ are connected in $G[V \setminus D_{MIP}]$, i.e. $s_i = s_j = 0$, then $d_{ij} \le s_i + s_j = 0$. Since $d_{ij} \ge 0$, it follows that $d_{ij} = 0$.

The inductive step. Assume that the statement holds for all $t \le t'$; we show that the statement is also true for $t = t' + 1$. Let $i, j$ be a pair connected by a shortest path of length $t' + 1$. Since removing all nodes in $N_{\min}(i,j)$ disconnects $i$ from $j$, the path between $i$ and $j$ must pass through some node $k \in N_{\min}(i,j)$. In addition, the shortest paths from $i$ to $k$ and from $k$ to $j$ have lengths at most $t'$. Thus, by the induction hypothesis we have $d_{ik} = d_{kj} = 0$. It follows from the constraint in (3-8) that $d_{ij} \le d_{ik} + d_{kj} = 0$. Thus, the statement holds for all $t > 0$. □

Finally,  we  show  the  relationship  between  the  LP  relaxation  of  IPvd  and  that  of 
MIPvd. 

Theorem  3.4.  The  optimal  solution  of  the  LP  relaxation  IPvd  can  be  found  by  solving  the 
LP  relaxation  of  Ml Pvd,  following  by  an  0(mn  +  n2  log  n)  tuning  procedure. 

Proof. Let $(s, d)$ be an optimal fractional solution of the LP relaxation of MIPvd. Associate a weight $d_{ij}$ with each edge $(i, j) \in E$. Let $d'_{ij}$ be the shortest distance between the two nodes $i, j$ under the new edge weights. We have

• $d'_{ij} \ge d_{ij}$ for all $i, j$, and $d'_{ij} = d_{ij}$ $\forall (i, j) \in E$.

• $d'_{ij} = \min_{k = 1, \dots, n} \{d'_{ik} + d'_{kj}\}$. Hence, $d'$ is a pseudo-metric.

The first statement can be shown by the same induction as in the proof of Theorem 1. The second statement comes from the definition of $d'_{ij}$.

Furthermore, we define $d^*_{ij} = \min\{d'_{ij}, 1\}$. If we use Johnson's algorithm [27] to compute the all-pairs shortest paths $d'_{ij}$, the time complexity to construct $d^*_{ij}$ from $d_{ij}$ is $O(mn + n^2 \log n)$. We shall prove that $(s, d^*)$ is a feasible solution of the LP relaxation of IPvd by showing that $(s, d^*)$ satisfies all constraints in IPvd.

By definition, we have $d^*_{ij} = \min\{d'_{ij}, 1\} \ge \min\{d_{ij}, 1\} = d_{ij}$ $\forall i, j$ and $d^*_{ij} = d_{ij}$ $\forall (i, j) \in E$. Thus, for all $(i, j) \in E$, $d^*_{ij} = d_{ij} \le s_i + s_j$.

In addition, $d^*$ is also a pseudo-metric as $d^*_{ik} + d^*_{kj} \ge \min\{d'_{ik} + d'_{kj}, 1\} \ge \min\{d'_{ij}, 1\} = d^*_{ij}$. From $d^*_{ij} \ge d_{ij}$, we have $\sum_{i<j} d^*_{ij} \ge \sum_{i<j} d_{ij} \ge (1 - \beta)\binom{n}{2}$, and any other lower bound that the constraints impose on $d_{ij}$ also holds for $d^*_{ij}$. Thus, $(s, d^*)$ is a feasible solution of the LP relaxation of IPvd.

Obviously, the minimum objective of the LP relaxation of MIPvd is smaller than or equal to that of IPvd. Since the objective values associated with $(s, d^*)$ and $(s, d)$, a minimum solution of the LP relaxation of MIPvd, are the same, $(s, d^*)$ must be a minimum solution of the LP relaxation of IPvd. □
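
The tuning procedure used in the proof can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `tune_distances` and the input layout are assumptions, and because all weights $d_{ij}$ are non-negative, Johnson's algorithm is replaced here by one binary-heap Dijkstra run per source, which gives $O(mn \log n)$ rather than the $O(mn + n^2 \log n)$ bound quoted above.

```python
import heapq


def tune_distances(n, edge_weights):
    """Return d* with d*[i][j] = min(shortest-path distance from i to j, 1).

    n            -- number of nodes, labelled 0..n-1 (assumed labelling)
    edge_weights -- dict {(i, j): d_ij} for undirected edges, d_ij >= 0
    """
    adj = [[] for _ in range(n)]
    for (i, j), w in edge_weights.items():
        adj[i].append((j, w))
        adj[j].append((i, w))

    d_star = []
    for src in range(n):
        dist = [float("inf")] * n
        dist[src] = 0.0
        heap = [(0.0, src)]
        while heap:
            du, u = heapq.heappop(heap)
            if du > dist[u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                if du + w < dist[v]:
                    dist[v] = du + w
                    heapq.heappush(heap, (dist[v], v))
        # Cap every distance at one, as in the definition of d*.
        d_star.append([min(x, 1.0) for x in dist])
    return d_star
```

Pairs in different components keep an infinite shortest-path distance, so the cap assigns them $d^*_{ij} = 1$.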

3.3.3  Cutting  Planes 

We  present  a  class  of  strong  cutting  planes  together  with  the  separation  procedure 
to  identify  those  cutting  planes.  These  can  be  used  in  conjunction  with  cutting  planes 
generated  automatically  by  optimization  packages  to  improve  the  convergence  of  the 
branch-and-cut  algorithm. 

3.3.3.1 Vertex-Connectivity and Valid Inequalities

One often overlooked characteristic of solutions to clustering and partitioning problems on graphs is that clusters must induce connected subgraphs. This characteristic is not reflected in either the IPvd or the MIPvd formulation.

A subset $S \subset V$ is a vertex-cut for a pair $(u, v)$ if removing $S$ from graph $G$ disconnects $u$ and $v$. For every vertex-cut $S$ of $(u, v)$, if $\sum_{i \in S} s_i = |S|$, then $d_{uv}$ must be one. Thus, we have the VC inequality

$$\sum_{i \in S} s_i - d_{uv} \le |S| - 1.$$

This inequality is valid for all feasible points inside the polyhedron of MIPvd.


Algorithm 6. Separation procedure for VC inequalities
 1: for each pair $(u, v) \in V \times V$ do
 2:     Construct a flow network from $G = (V, E)$ as follows
 3:     Assign $u$ and $v$ as source and sink, respectively
 4:     Each node $k \in V$ has capacity $\bar{s}_k = 1 - s_k$
 5:     Every edge has capacity $\infty$
 6:     if $(u, v) \in E$, then $(u, v)$ has capacity zero
 7:     Find the maximum flow (min-cut)
 8:     if the maximum flow is less than $\bar{d}_{uv} = 1 - d_{uv}$ then
 9:         Find the min vertex-cut $S$
10:         Add the VC inequality associated with $S$ to MIP
11:     end if
12: end for
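
For a single pair $(u, v)$, the separation step of Algorithm 6 can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, node capacities are taken to be $1 - s_k$, each node is split into an "in" and an "out" copy to model its capacity, and a plain Edmonds-Karp augmenting-path search stands in for the push-relabel algorithm with dynamic trees mentioned in the text.

```python
from collections import deque


def min_vertex_cut(n, edges, cap, u, v):
    """Minimum-capacity vertex cut separating u from v.

    n     -- number of nodes, labelled 0..n-1
    edges -- iterable of undirected edges (a, b)
    cap   -- cap[k] is the capacity of node k
    Returns (cut_value, cut_nodes).
    """
    INF, EPS = float("inf"), 1e-12
    # Node k is split into k_in = 2k and k_out = 2k + 1.
    res = [dict() for _ in range(2 * n)]  # residual capacities

    def add_arc(a, b, c):
        res[a][b] = res[a].get(b, 0.0) + c
        res[b].setdefault(a, 0.0)  # reverse residual arc

    for k in range(n):
        add_arc(2 * k, 2 * k + 1, INF if k in (u, v) else cap[k])
    for a, b in edges:
        if {a, b} == {u, v}:
            continue  # a direct edge (u, v) gets capacity zero
        add_arc(2 * a + 1, 2 * b, INF)
        add_arc(2 * b + 1, 2 * a, INF)

    source, sink = 2 * u + 1, 2 * v  # u_out and v_in

    def bfs():
        parent = {source: None}
        queue = deque([source])
        while queue:
            a = queue.popleft()
            for b, c in res[a].items():
                if c > EPS and b not in parent:
                    parent[b] = a
                    queue.append(b)
        return parent

    flow = 0.0
    while True:
        parent = bfs()
        if sink not in parent:
            break
        path, b = [], sink
        while parent[b] is not None:
            path.append((parent[b], b))
            b = parent[b]
        push = min(res[a][b] for a, b in path)  # bottleneck capacity
        for a, b in path:
            res[a][b] -= push
            res[b][a] += push
        flow += push

    # A node is in the cut if its "in" copy is reachable but its "out" copy is not.
    cut = [k for k in range(n) if 2 * k in parent and 2 * k + 1 not in parent]
    return flow, cut


def violated_vc_cut(n, edges, s, d_uv, u, v):
    """Return a vertex cut S whose VC inequality is violated, or None."""
    s_bar = [1.0 - x for x in s]
    value, cut = min_vertex_cut(n, edges, s_bar, u, v)
    return cut if value < (1.0 - d_uv) - 1e-9 else None
```

Calling `violated_vc_cut` for every node pair reproduces the outer loop of Algorithm 6; in practice one would call it sparingly, as the text notes.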


3.3.3.2  Separation  Procedure  for  VC  Inequalities 

Given a point (fractional solution) $(s, d)$, an exact separation algorithm for some class of inequalities either finds a member of the class violated by $(s, d)$, or proves that no such member exists. In many cases, finding such an algorithm is intractable (an NP-hard problem) and one has to settle for heuristic procedures. Fortunately, there is an exact algorithm for our separation procedure based on finding the max-flow on the network with node capacities.

The VC inequality can be rewritten as

$$\sum_{i \in S} \bar{s}_i - \bar{d}_{uv} \ge 0, \quad S \text{ is any vertex-cut of } (u, v),$$

where $\bar{s}_i = 1 - s_i$ and $\bar{d}_{uv} = 1 - d_{uv}$.

Therefore, the point $(s, d)$ violates this inequality if and only if $\sum_{i \in S} \bar{s}_i < \bar{d}_{uv}$. The most violated inequality is the one that minimizes the sum $\sum_{i \in S} \bar{s}_i$, given that $S$ is a vertex-cut of $(u, v)$. Thus, the subset $S$ corresponding to the most violated inequality can be found as a minimum capacitated vertex-cut of $(u, v)$. The separation procedure is described in Algorithm 6. Here, we need to solve the maximum-flow (min-cut) problem in networks with both node and edge capacities. If we apply the push-relabel algorithm with dynamic trees [42], the time complexity to find cutting planes for one node pair is $O(mn \log \frac{n^2}{m})$. The total time complexity for the separation procedure is then $O(n^3 m \log \frac{n^2}{m})$. In our implementation, this procedure is called sparingly in order to avoid excessive running time.

Algorithm 7. Sharpest Decreasing Vertices (SDV)
 1: Start with some $\beta$-vertex disruptor $D \subset V$.
 2: Repeat
 3:     while (true) do
 4:         $u = \arg\min_{v \in D} \{\mathcal{P}(G[V \setminus (D \setminus \{v\})])\}$
 5:         if ($D \setminus \{u\}$ is a $\beta$-disruptor) then $D = D \setminus \{u\}$ else break
 6:     for each ($w \in V \setminus D$) do
 7:         $D_w = D \cup \{w\}$
 8:         $u = \arg\min_{v \in D_w} \{\mathcal{P}(G[V \setminus (D_w \setminus \{v\})])\}$
 9:         if ($u \ne w$) then $D = D_w \setminus \{u\}$
10: Until ($D$ not changing)
11: Output $D$.

3.3.4  Primal  Heuristic 

The  search  for  an  optimal  solution  in  a  branch  and  cut  algorithm  can  be  accelerated 
by  obtaining  a  high  quality  feasible  solution  to  provide  upper  bounds  for  pruning 
other  subproblems.  We  present  a  heuristic  that  rounds  the  fractional  solution  of  MIP 
relaxations  to  get  integral  solutions. 

Let $(s, d)$ be a fractional solution of an LP relaxation. We first sort the $s_i$ in non-decreasing order $s_{i_1} \le s_{i_2} \le \dots \le s_{i_n}$. Then we round down $s_{i_1}, s_{i_2}, \dots, s_{i_k}$ to zero and round up $s_{i_{k+1}}, s_{i_{k+2}}, \dots, s_{i_n}$ to one, where $k$ runs from 1 to $n$. If the obtained solution is a $\beta$-vertex disruptor, the local search method described in Algorithm 7 is then used to refine the solution. The local search method refines the solution by repeatedly:

•  Removing  node(s)  from  the  disruptor  if  possible 

•  Swapping  a  node  w  outside  the  disruptor  with  a  node  u  in  the  disruptor  that  gives 
the  sharpest  decrease  in  connectivity. 

The  local  search  terminates  when  no  improvement  exists. 
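
The rounding step can be illustrated with the sketch below. It is a simplified reading of the heuristic, not the authors' code: the function names are hypothetical, the graph is assumed undirected with nodes labelled 0..n-1, only the smallest feasible threshold rounding is returned, and the local search refinement is omitted.

```python
def pairwise_connectivity(n, edges, removed):
    """Number of connected node pairs after deleting the nodes in `removed`."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        if a not in removed and b not in removed:
            parent[find(a)] = find(b)

    sizes = {}
    for x in range(n):
        if x not in removed:
            root = find(x)
            sizes[root] = sizes.get(root, 0) + 1
    return sum(c * (c - 1) // 2 for c in sizes.values())


def round_fractional(n, edges, s, beta):
    """Smallest threshold rounding of s that is a beta-vertex disruptor."""
    target = beta * n * (n - 1) / 2
    order = sorted(range(n), key=lambda i: s[i])  # non-decreasing s_i
    # Try to keep as many nodes as possible: round only the n - k largest up.
    for k in range(n, -1, -1):
        removed = set(order[k:])
        if pairwise_connectivity(n, edges, removed) <= target:
            return removed
    return set(order)  # unreachable for beta >= 0, kept for safety
```

Because removing additional nodes never increases the pairwise connectivity, the first feasible threshold found while scanning from $k = n$ downwards removes the fewest nodes among all threshold roundings.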

3.4  Experimental  study 

We perform experiments to measure the gap between the solution of the pseudo-approximation algorithm (Algorithm 3) and an optimal solution found by solving an integer programming formulation. We generate two types of networks: random networks following the Erdős-Rényi model and power-law networks following the Barabási-Albert model. For each type of network, we generate different instances with the number of nodes ranging from 30 to 100. Edge densities of the generated networks are around 10%. The machine used for the experiments has an 8-core 2.2 GHz CPU and 64 GB of memory.

The sizes of the disruptors found by Algorithm 3 and the sizes of the optimal disruptors are presented in Tables 3-1 and 3-2. Despite the large theoretical gap of the pseudo-approximation algorithm, the algorithm produces near-optimal solutions and returns optimal solutions in more than half of the instances (marked with bold numbers).

Notably, our algorithm performs extremely well on power-law networks. It misses the optimal solution in only one instance, when the number of vertices is 90. Between a random network and a power-law network of roughly the same size, the size of the disruptor in the power-law network is significantly smaller (approximately 50%) than that in the random network, showing the extremely high degree of vulnerability of power-law networks to attacks [7].

Table 3-1. Size of disruptor on Erdős-Rényi networks at 60% connectivity.

Vertex      30    40    50    60    70    80    90   100
Edge        43    78   122   177   241   316   400   495
Optimal      2     4     7     9    11    12    16    18
Approx       3     4     8     9    11    13    16    19

Table 3-2. Size of disruptor on Barabási-Albert networks at 60% connectivity.

Vertex      30    40    50    60    70    80    90   100
Edge        54   131   189   208   245   262   354   445
Optimal      1     3     5     6     6     5     7     9
Approx       1     3     5     6     6     5    10     9

Figure 3-2. Disruptors found by different methods in the Western States Power Grid of
the United States at different levels of disruption.

C) 30% connectivity    D) 10% connectivity

Figure 3-3. Disruptors found by different methods in the Western States Power Grid of
the United States at different levels of disruption.

The running time for solving the integer program increases from a few minutes to
10 hours for the largest test cases, while in the longest run the pseudo-approximation
algorithm takes only 29 seconds.

3.4.1  Performance  of  the  Branch-and-cut  Algorithm 


                                      Time (seconds)           Constraints
Vertex   Edge       β   Removed      IPvd      MIPvd          IPvd        MIPvd
                         vertex
    50    141   60.0%         4        63          8        60,167        4,861
   150    286    1.0%        18    19,788          2     1,665,362       31,887
     -      -    5.0%        15    18,070          7             -       32,161
     -      -    8.0%        12       n/a         73             -       33,242
     -      -   10.0%        11       n/a      1,363             -       39,615
     -      -   20.0%         9       n/a      1,737             -       39,313
     -      -   40.0%         7       n/a      2,149             -       42,830
     -      -   60.0%         5       n/a      1,610             -       38,458
     -      -   90.0%         2    26,277        147             -       34,321
   200    387   60.0%         8       n/a     64,860     3,960,488       72,980
   600  1,166    0.5%        69       n/a     48,918   107,641,467      516,656
  1000  1,959    0.5%       198       n/a        747   499,340,027    1,437,326

Table 3-3. Comparisons of IPvd and MIPvd on power-law networks


We implement our branch-and-cut algorithm using GUROBI 4.0 on a computer with an Intel Xeon 2.93 GHz processor and 12 GB of memory. Table 3-3 shows results for IPvd and our new branch-and-cut algorithm (MIPvd) on power-law networks [12] of various sizes. We report, for each disruption level $\beta$, the number of removed vertices in the optimal solution, the number of rows (constraints), nonzeros (nonzero coefficients), and the solving time.

As shown in Table 3-3, our branch-and-cut algorithm utilizing the sparse metric technique and strong cutting planes is substantially faster and more memory-efficient than the original branch-and-cut built into the GUROBI MIP solver. The speed-up factor ranges from 8 times for 50 nodes to several thousand times for larger instances. For the network of 150 nodes, MIPvd often takes less than 30 minutes, while IPvd runs out of memory or does not terminate after 100,000 seconds (noted with n/a).

3.4.2  Case  study:  Western  States  Power  Grid 

We study a network of 4941 nodes and 6594 edges representing the topology of the Western States Power Grid of the United States. The network is shown to have high clustering with small characteristic path lengths [79]; hence the network is rather vulnerable to targeted attacks.

It is intractable to find the optimal disruptor using integer programming for such a large network. Our approximation algorithm uses a row-generation technique to reduce the excessive number of constraints and runs on a cluster of 20 nodes, each equipped with an 8-core 2.2 GHz CPU and 64 GB of memory.

We compare attack schemes that target nodes based on their centrality with our pseudo-approximation algorithm to show that those methods might not be suitable for revealing network vulnerability in terms of overall network connectivity. The compared methods include

1. Degree Centrality. The algorithm sequentially removes the node with the maximum degree until the pairwise connectivity in the graph is less than $\beta\binom{n}{2}$.

2. Betweenness Centrality. We repeatedly remove the node with maximum betweenness centrality until the pairwise connectivity in the graph is less than $\beta\binom{n}{2}$. Recall that the betweenness $B_t(v)$ of node $v$ is $B_t(v) = \sum_{s \ne v \ne t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$, where $\sigma_{st}$ is the number of shortest paths from $s$ to $t$, and $\sigma_{st}(v)$ is the number of shortest paths from $s$ to $t$ that pass through node $v$.

3. Eigenvector Centrality. Nodes are removed in descending order of their eigenvector centrality (PageRank) values with the default damping factor of 85% as in [65].

We show in Figure 3.4 the vulnerability reported by different methods at various levels of disruption. The network is surprisingly vulnerable to targeted attacks. For example, to reduce the connectivity in the network by 40% (60% connectivity remains) we only need to destroy 0.16% of the stations. To bring the connectivity down to the same level, the average fractions of nodes to remove for random networks and power-law networks are 13% and 3%, respectively. Even destroying only 1% of the stations can disrupt 90% of the connectivity in the network.

None of the other methods correctly reveals the vulnerability of the power grid. Their disruptor sizes are 6 to 20 times larger than those of our approximation algorithm. Thus, using alternative assessment methods rather than the ones we propose might create the dangerous illusion that the network is highly robust.

Because of the high clustering property, nodes that lie between clusters in the network
will often have high betweenness values. Intuitively, we expected the betweenness
method to easily identify those nodes and perform well in the experiment. Surprisingly,
the performance of the betweenness method turns out to be even worse than that of degree
centrality.

We visualize the network fragmentation at varied disruptive levels in Figures 3-3C and 3-3D. The disruptor separates the network into large connected components.

We observe that not all nodes in the disruptor at the 30% connectivity level are in the disruptor at the 10% level. Hence, we cannot assume that the disruptor at a lower connectivity level is a superset of the disruptor at a higher level. This explains why centrality assessment methods, in which nodes are selected in a fixed order, fail to exhibit the vulnerability of the network at different disruptive levels.

CHAPTER 4

JOINT LINK AND NODE ATTACKS

A) Node attack    B) Link attacks    C) Link-node attack

Figure 4-1. A) After removing nodes 1 and 2 with the highest degree, the network remains
connected and the pairwise connectivity is reduced by only 35%. As shown in a.,
the solution that minimizes the connectivity (nodes 3 and 7) effectively
breaks the network into two parts, disrupting 67% of the connectivity. B) Minimum
cost solutions to reduce 50% of the connectivity, assuming links have cost 2
and nodes have cost 3: a. nodes only, b. links only, c. joint nodes and links. The
minimum cost is 6 if attacking only nodes or only links, and is 5 if both links
and nodes are targeted. Thus, it is insufficient to study node and link attacks
separately.


We begin with a sample network that shows the advantage of the pairwise connectivity metric over node centrality measures in Fig. 4-1A. Assume two nodes are to be removed from our simple example. If the two nodes are selected according to their degree centrality, nodes 1 and 2 will be removed and the network remains connected. However, if we remove nodes to minimize the pairwise connectivity, nodes 3 and 7 are targeted, and the network is effectively broken into two smaller components. The fraction of pairwise connectivity in the residual network, denoted by $\beta$, reduces drastically to $\beta \approx 33\%$. Evidently, optimizing the pairwise-connectivity metric reveals more accurate insights into the network vulnerability.

Fig. 4-1 also illustrates a fundamental shortcoming of existing work: the inability to assess network vulnerability under joint node and link attacks. The three sub-figures show the minimum cost attack strategies to reduce the pairwise connectivity to $\beta = 50\%$, assuming each link has cost 2 and each node has cost 3. While the minimum costs for both the node attack (Fig. 4-1A) and the link attack (Fig. 4-1B) are 6, the minimum cost for the node-link attack (node 3 and link (6, 7)) (Fig. 4-1C) is only 5. Thus, it is insufficient to assess link vulnerability and node vulnerability separately when both links and nodes in the network can be targeted. To make matters worse, assume node 3 and link (6, 7) have the same cost $\epsilon > 0$; the minimum costs for node, link, and node-link attacks will then be $3 + \epsilon$, $4 + \epsilon$, and $2\epsilon$, respectively. As the ratios $(3 + \epsilon)/(2\epsilon)$ and $(4 + \epsilon)/(2\epsilon)$ grow without bound as $\epsilon \to 0$, the existing methods can seriously misjudge the network vulnerability.

To address this shortcoming, we study the effect of joint node and link attacks in terms of connectivity. We introduce a new problem, called $\beta$-disruptor, that finds a minimum cost set of nodes and links whose removal degrades the pairwise connectivity to a given extent (a fraction $\beta$). The $\beta$-disruptor problem aims to provide a more comprehensive assessment of network vulnerability. It generalizes both the $\beta$-vertex disruptor and the $\beta$-edge disruptor problems proposed in our previous work [33]. To the best of our knowledge, this is the first work to address the effect of simultaneous attacks on both links and nodes on network connectivity.

Our contributions are summarized as follows:

• Providing an underlying framework for assessing vulnerability under joint node and link attacks and formulating it as an optimization problem, $\beta$-disruptor. Other performance measures, such as the maximum flow between a given source-destination pair or the average maximum flow between pairs of nodes, can also be used in place of pairwise connectivity to define new problems.

• Our major result is an $O(\sqrt{\log n})$ bicriteria approximation algorithm for both undirected and directed networks. The algorithm finds a $\beta$-disruptor with cost at most $O(\sqrt{\log n})$ times that of an optimal $\beta'$-disruptor, with $\beta'$ slightly less than $\beta$. We propose an efficient heuristic using recursive spectral bisection and variable neighborhood search. Finally, our experiments on both synthetic and real-world datasets indicate the efficacy and scalability of our proposed algorithms.

We briefly present terminologies and problem definitions in Section 4.1. Then we propose the $O(\sqrt{\log n})$ bicriteria approximation algorithm for $\beta$-disruptor in Section 4.2. Section 4.3 presents the efficient heuristic to find a $\beta$-disruptor. We obtain numerical results for the presented algorithms in Section 4.4.

4.1  Mixed  Removal  of  Nodes  and  Links 
Once again, we abstract our general network model as a graph $G = (V, E)$, where $V$ refers to a set of nodes and $E$ refers to a set of links. Each vertex $u \in V$ is associated with a cost $c(u) \ge 0$ and each edge $(u, v) \in E$ has a cost $c(u, v) \ge 0$. For convenience, we also denote the number of nodes and links by $n$ and $m$, respectively.

If $G$ is an undirected graph, a vertex pair $(u, v) \in V \times V$ is connected iff there exists a path between $u$ and $v$. If $G$ is a directed graph, a vertex pair $(u, v)$ is said to be connected if there exist paths between $u$ and $v$ in both directions. We denote the pairwise connectivity of a graph $G$ by $\mathcal{P}(G)$. Clearly, the pairwise connectivity is maximized at $\binom{n}{2}$ when $G$ is a (strongly) connected graph. For convenience, we use the word component to refer to a connected component in undirected graphs and a strongly connected component (SCC) in directed graphs whenever the context is clear.

$\beta$-disruptor. Given $0 \le \beta \le 1$, a $\beta$-disruptor is a pair of subsets

$$D_\beta = (V_\beta \subseteq V,\ E_\beta \subseteq E)$$

whose removal from $G$ makes the pairwise connectivity in the residual graph $G' = (V \setminus V_\beta,\ E \setminus (E_\beta \cup V_\beta \times V_\beta))$ at most $\beta\binom{n}{2}$. The $\beta$-disruptor problem asks for a $\beta$-disruptor with the minimum total cost

$$c(D_\beta) = \sum_{u \in V_\beta} c(u) + \sum_{e \in E_\beta} c(e).$$

There are two special types of $\beta$-disruptor: if $V_\beta = \emptyset$, then $D_\beta$ is a $\beta$-edge disruptor; and if $E_\beta = \emptyset$, then $D_\beta$ is a $\beta$-vertex disruptor. The uniform-cost versions of the $\beta$-edge disruptor problem and the $\beta$-vertex disruptor problem were previously studied in [32].
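
The definition can be checked mechanically. The sketch below is illustrative only: all function names are hypothetical, the graph is given as a list of directed arcs (an undirected edge would be supplied in both directions), arcs incident to a removed vertex are dropped together with it, and Kosaraju's algorithm is one of several standard ways to find the strongly connected components.

```python
def strongly_connected_components(nodes, arcs):
    """Kosaraju's algorithm (iterative); returns a list of node sets."""
    succ = {v: [] for v in nodes}
    pred = {v: [] for v in nodes}
    for a, b in arcs:
        succ[a].append(b)
        pred[b].append(a)

    order, seen = [], set()
    for root in nodes:  # first pass: record finishing order
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            v, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                order.append(v)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))

    comps, assigned = [], set()
    for root in reversed(order):  # second pass on the reversed graph
        if root in assigned:
            continue
        comp, stack = {root}, [root]
        assigned.add(root)
        while stack:
            v = stack.pop()
            for w in pred[v]:
                if w not in assigned:
                    assigned.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def is_beta_disruptor(nodes, arcs, v_beta, e_beta, beta):
    """True if removing (v_beta, e_beta) leaves at most beta * C(n, 2) connected pairs."""
    kept_nodes = [v for v in nodes if v not in v_beta]
    kept_arcs = [(a, b) for a, b in arcs
                 if a not in v_beta and b not in v_beta and (a, b) not in e_beta]
    comps = strongly_connected_components(kept_nodes, kept_arcs)
    connectivity = sum(len(c) * (len(c) - 1) // 2 for c in comps)
    n = len(nodes)
    return connectivity <= beta * n * (n - 1) / 2


def disruptor_cost(v_beta, e_beta, node_cost, arc_cost):
    """Total cost c(D_beta) of the removed nodes and arcs."""
    return sum(node_cost[v] for v in v_beta) + sum(arc_cost[e] for e in e_beta)
```

On a directed cycle, for example, removing any single arc already leaves every node in its own component, so that one arc is a $\beta$-edge disruptor for every $\beta \ge 0$.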

4.1.1  Mixed  Integer  Linear  Programming 

The $\beta$-disruptor problem can be formulated as a Mixed Integer Linear Programming (MILP)
problem as follows:

minimize    $\sum_{u \in V} c(u)\, s_u + \sum_{e \in E} c(e)\, x_e$    (4-1)

subject to  $d_{uv} \le s_u + s_v + x_{uv}$,    $(u, v) \in E$,    (4-2)

            $d_{uv} + d_{vw} \ge d_{uw}$,    $(u, v) \in E,\ w \in V$,    (4-3)

            $\sum_{u < v} d_{uv} \ge (1 - \beta)\binom{n}{2}$,    (4-4)

$u  —  dUv  —  XUV) 

u,  V 

(4-5) 

            $s_u, x_{uv} \in \{0, 1\}$, $d_{uv} \in [0, 1]$,    $u, v \in V$,    (4-6)

where $s_u = 1$ if node $u$ is removed and $s_u = 0$ otherwise. Similarly, $x_{uv} = 1$ indicates the removal of edge $(u, v)$. The variables $d_{uv}$ represent the disconnectivity (or distance) between nodes $u$ and $v$ in the residual network, i.e. $d_{uv} = 1$ if $u$ and $v$ are disconnected and $d_{uv} = 0$ otherwise. The following lemma states the correctness of our formulation. The proof is similar to that of the MILP for the $\beta$-vertex disruptor problem in [30], and is omitted here.

Lemma 8. The optimal solution of the MILP (4-1)-(4-6) induces a minimum cost $\beta$-disruptor
$D_\beta = (V_\beta, E_\beta)$ of $G$, where $V_\beta = \{u \mid s_u = 1\}$ and $E_\beta = \{(u, v) \mid x_{uv} = 1\}$.

4.1.2  Relation  between  edge  costs  and  vertex  costs 

Since removing either $u$ or $v$ causes more disruption than removing the edge $(u, v)$, we have the following lemma.

Lemma 9. An edge $(u, v) \in E$ with $c(u, v) > \min\{c(u), c(v)\}$ will not appear in any optimal $\beta$-disruptor for any $\beta > 0$.

The lemma reflects that vertices' costs are often higher than the costs of incident edges. Similarly, removing a vertex should not cost more than removing all of its incident edges.

Lemma 10. A vertex $u$ with $c(u) > \sum_{(u, v) \in E} c(u, v)$ will not appear in any optimal $\beta$-disruptor for any $\beta > 0$.

Lemmas 9 and 10 help us exclude edges and vertices with "excessive" removal costs from further consideration. From the perspective of protecting critical infrastructures, they provide relative caps on how much extra resource we should allocate to the network elements.
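
As an illustration of how Lemmas 9 and 10 could be used as a preprocessing filter, consider the following sketch. The function name `prune_by_cost` and the dictionary-based input format are assumptions made for this example; the report does not describe such a routine.

```python
def prune_by_cost(node_cost, edge_cost):
    """Return the nodes and edges that may still appear in an optimal disruptor.

    node_cost -- dict {u: c(u)}
    edge_cost -- dict {(u, v): c(u, v)} for undirected edges
    """
    # Total cost of the edges incident to each vertex.
    incident_sum = {u: 0.0 for u in node_cost}
    for (u, v), c in edge_cost.items():
        incident_sum[u] += c
        incident_sum[v] += c

    # Lemma 9: drop edges that cost more than their cheaper endpoint.
    candidate_edges = {e for e, c in edge_cost.items()
                       if c <= min(node_cost[e[0]], node_cost[e[1]])}
    # Lemma 10: drop vertices that cost more than all their incident edges together.
    candidate_nodes = {u for u, c in node_cost.items() if c <= incident_sum[u]}
    return candidate_nodes, candidate_edges
```

With the costs of Fig. 4-1 (links cost 2, nodes cost 3), every link survives the filter, while any node of degree one is excluded because its single incident link is cheaper.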

By definition, $\beta$-vertex disruptor can be seen as a special case of $\beta$-disruptor when
all edges have infinite costs, and $\beta$-edge disruptor is a special case of $\beta$-disruptor when
all vertices have infinite costs. Since both the vertex and the edge disruptor problems are NP-hard, the
$\beta$-disruptor problem is also NP-hard for $0 < \beta < 1$.

4.2  Bicriteria  Approximation  Algorithm  for  Joint  Link  and  Node  Attacks 

In this subsection, we present an $O(\sqrt{\log n})$ bicriteria approximation algorithm for
the $\beta$-disruptor problem. Since $\beta$-vertex disruptor is a special case of $\beta$-disruptor, the
algorithm implies an $O(\sqrt{\log n})$ bicriteria approximation algorithm for $\beta$-vertex disruptor,
which improves on the best previous result for $\beta$-vertex disruptor, the $O(\log n \log\log n)$ bicriteria
approximation algorithm in [33].

4.2.1  Algorithm  Description 

We will refer to the input network as the original network. We first reduce the $\beta$-disruptor problem in the original network to an instance of the $\beta$-edge disruptor problem in an auxiliary directed graph. The reduction maps each undirected edge to two alternating directed edges and each node to a surrogate edge. More importantly, we show that the reduction 'preserves' relative performance guarantees. We then apply a recursive cut procedure to find a near-optimal set of both alternating edges and surrogate edges that correspond to a $\beta$-disruptor in the original network.

Our algorithm JLNA(G) to find a $\beta$-disruptor in a directed graph $G$ is summarized in Algorithm 8. In the first phase, the algorithm constructs an auxiliary graph $G' = (V', E')$ by splitting each vertex $v \in V$ into two new vertices $v^+$ and $v^-$. Formally, the sets of vertices and edges in $G'$ are defined as

$$V' = \{v^-, v^+ \mid v \in V\},$$
$$E' = \{(v^-, v^+) \mid v \in V\} \cup \{(u^+, v^-) \mid (u, v) \in E\}.$$

In addition, we assign costs $c'(\cdot)$ to the edges in $G'$: $c'(v^-, v^+) = c(v)$ for the surrogate edge $(v^-, v^+)$, and $c'(u^+, v^-) = c'(v^+, u^-) = c(u, v)$ for the alternating edges $(u^+, v^-)$ and $(v^+, u^-)$. In the case where $E$ is a mix of both undirected and directed edges, we also convert each directed edge $(p, q) \in E$ into an alternating edge $(p^+, q^-) \in E'$ with cost $c'(p^+, q^-) = c(p, q)$.
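
The first phase can be sketched as follows. This is a minimal illustration under assumptions: the function names and the encoding of the split vertices as tuples `(v, "-")` and `(v, "+")` are choices made for this example, not notation from the report.

```python
def build_auxiliary_graph(node_cost, edge_cost, directed_cost=None):
    """Return {arc: cost} for the auxiliary directed graph G'.

    node_cost     -- dict {v: c(v)}
    edge_cost     -- dict {(u, v): c(u, v)} for undirected edges
    directed_cost -- optional dict {(p, q): c(p, q)} for directed edges
    """
    aux = {}
    for v, c in node_cost.items():
        aux[((v, "-"), (v, "+"))] = c      # surrogate edge (v-, v+)
    for (u, v), c in edge_cost.items():
        aux[((u, "+"), (v, "-"))] = c      # alternating edges, one per direction
        aux[((v, "+"), (u, "-"))] = c
    for (p, q), c in (directed_cost or {}).items():
        aux[((p, "+"), (q, "-"))] = c      # a directed edge yields a single arc
    return aux


def map_back(aux_arcs):
    """Translate a set of auxiliary arcs into (removed nodes, removed edges)."""
    nodes, edges = set(), set()
    for (a, side_a), (b, side_b) in aux_arcs:
        if a == b and side_a == "-" and side_b == "+":
            nodes.add(a)                   # surrogate edge -> node of G
        else:
            edges.add((a, b))              # alternating edge -> edge of G
    return nodes, edges
```

`map_back` corresponds to the final mapping step of the algorithm, where an edge disruptor of the auxiliary graph is translated into nodes and edges of the original network.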

In the second phase, the recursive cut procedure, shown in lines 4 to 11, constructs a $\beta$-edge disruptor of $G'$, denoted by $E'_\beta$. Here, for a given $\beta' < \beta$, $\tilde{\beta} = \frac{1}{2}(\beta + \beta')$. The $\beta$-edge disruptor is found by iteratively applying a subroutine SPARSE_CUT to the strongly connected components in $G'$. The subroutine SPARSE_CUT cuts the components into smaller ones, and the edges in a subset of the cuts are added to $E'_\beta$. The process continues until the pairwise connectivity in the graph reduces to $\tilde{\beta}\binom{n}{2}$ or smaller. At the end of the second phase, $E'_\beta$ is mapped back to edges and nodes in $G$ to give a $\beta$-disruptor.

As shown in lines 4 and 5, the subroutine SPARSE_CUT is applied to each strongly connected component $C$ to find a minimum ratio cut $(S', \bar{S}')$ in $C$. The cut ratio of a cut is defined as follows.

Definitions. Let $G' = (V', E')$ be a directed graph. The ratio of a cut $(S', \bar{S}')$ is $\alpha(S') = \frac{c_{out}(S')}{|S'||\bar{S}'|}$, where $c_{out}(S')$ is the total cost of edges going out from $S'$. In addition, a cut with the minimum cut ratio is called a minimum ratio cut, and its ratio is denoted by

$$\alpha(G') = \min_{S' \subset V'} \alpha(S').$$

The output of SPARSE_CUT is a pair $(C_E, C_\alpha)$, where $C_E = (S', \bar{S}')$ and $C_\alpha = \alpha(S')$. For simplicity, we postpone the description of SPARSE_CUT until the proof of the approximation ratio.

Algorithm 8: JLNA(G)
 1. Construct the auxiliary graph $G' = (V', E')$
 2. $\tilde{\beta} \leftarrow \frac{1}{2}(\beta + \beta')$
 3. $E'_\beta \leftarrow \emptyset$
 4. for each SCC $C$ in $G'$
 5.     $(C_E, C_\alpha) \leftarrow$ SPARSE_CUT($C$)
 6. while $\mathcal{P}(G') > \tilde{\beta}\binom{n}{2}$
 7.     Find an SCC $C^*$ of $G'$ with minimum cut ratio $C^*_\alpha$
 8.     $E'_\beta \leftarrow E'_\beta \cup C^*_E$
 9.     Remove edges in $C^*_E$ from $G'$
10.     for each new component $C'$ in $G'$
11.         $(C'_E, C'_\alpha) \leftarrow$ SPARSE_CUT($C'$)
12. $V_\beta \leftarrow \{v \mid (v^-, v^+) \in E'_\beta\}$
13. $E_\beta \leftarrow \{(u, v) \mid (u^+, v^-) \in E'_\beta\}$
14. return $D_\beta = (V_\beta, E_\beta)$
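
To make the cut ratio concrete, the sketch below evaluates $\alpha(S)$ for a given side $S$ and finds a minimum ratio cut by exhaustive search. This brute-force search is only a stand-in for tiny examples and is not the SPARSE_CUT subroutine, whose description the text postpones; the function names and the `{(a, b): cost}` arc format are assumptions.

```python
from itertools import combinations


def cut_ratio(nodes, arc_cost, side):
    """alpha(S) = c_out(S) / (|S| * |V minus S|) for a proper, non-empty subset S."""
    side = set(side)
    # Total cost of arcs leaving S.
    c_out = sum(c for (a, b), c in arc_cost.items() if a in side and b not in side)
    return c_out / (len(side) * (len(nodes) - len(side)))


def min_ratio_cut(nodes, arc_cost):
    """Exhaustive search over all cuts; exponential time, for tiny graphs only."""
    nodes = list(nodes)
    best = None
    for size in range(1, len(nodes)):
        for side in combinations(nodes, size):
            ratio = cut_ratio(nodes, arc_cost, side)
            if best is None or ratio < best[0]:
                best = (ratio, set(side))
    return best
```

On a directed 4-cycle with unit costs, every set of consecutive nodes has exactly one outgoing arc, so the balanced split of two and two nodes attains the smallest ratio, 1/4.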

In the main loop of JLNA, presented in lines 6 to 11, in each round we select, among the existing SCCs, an SCC $C^*$ in $G'$ that has the smallest cut ratio. Let $C^*_E$ and $C^*_\alpha$ be the cut set and the cut ratio of the cut found by SPARSE_CUT in $C^*$. We add $C^*_E$ to $E'_\beta$ and remove $C^*_E$ from $G'$. Removing $C^*_E$ breaks $C^*$ into two or more strongly connected components. We again apply SPARSE_CUT to those components to find their minimum ratio cuts.

The main loop terminates when the pairwise connectivity in $G'$ is no more than $\tilde{\beta}\binom{n}{2}$. Then we construct the final solution by mapping each surrogate edge $(v^-, v^+) \in E'_\beta$ to the node $v$ in $G$, and each alternating edge $(u^+, v^-) \in E'_\beta$ to the edge $(u, v)$ in $G$.

4.2.2 Analysis of Approximation Ratio


We show that the JLNA algorithm is an $O(\sqrt{\log n})$ bicriteria approximation algorithm for the $\beta$-disruptor problem. We first show the connection between the cost of an optimal $\beta$-disruptor and the minimum cut ratio in Lemma 11. After that, we derive the approximation ratio for JLNA in Theorem 4.1.

The connection between the cost of an optimal $\beta$-disruptor and the minimum cut ratio is not obvious. Cuts in directed networks have different characteristics in comparison to their counterparts in undirected networks.

First, the cut ratios of $(S, \bar{S})$ and $(\bar{S}, S)$ are different in general. In addition, different cuts may be associated with the same set of links. For example, the cuts defined by $S$ = {blue nodes} and $S$ = {blue and green nodes} are associated with the same set of links $\{(u, v)\}$. To treat these differences, we use a randomized argument in the following lemma.

Second, components in directed networks are highly interdependent. As illustrated in Fig. 4-2, the failure of link $(u, v)$ effectively breaks the network into four disconnected components. The red and green components lose communication with the other parts of the network, even though none of their incoming and outgoing edges, colored in black, fail. In contrast, the only way to separate a component from the rest in undirected networks is to remove all links incident to the component.

Nevertheless, we are able to link the average cost to disrupt connected pairs in an optimal $\beta$-disruptor to the minimum cut ratio in the following lemma.

Lemma 11. Given a directed graph $G = (V, E)$ and a subset of edges $M_\omega \subset E$, if $\omega = \mathcal{P}(G) - \mathcal{P}(G[E \setminus M_\omega]) > 0$, then $\frac{c(M_\omega)}{\omega} \ge \frac{1}{3}\alpha_{min}(G)$, where

$$\alpha_{min}(G) = \min\{\alpha(C) \mid C \text{ is an SCC of } G\}.$$

Proof. First, we prove the case in which G is strongly connected. When G is not strongly
connected, the lemma can be proved by aggregating the results on the SCCs of G.

Figure 4-2. High interdependence of networks' elements. Removing the marked link
(u,v) breaks the (strongly) connected network into four components. Notice
that the red and green components are separated from the others, even
when none of the incoming links to or outgoing links from those components
are removed.

If G is strongly connected, then $\alpha_{\min}(G) = \alpha(G)$ and $\mathcal{P}(G) = \binom{n}{2}$. Let $C_1, C_2, \ldots, C_k$
be the SCCs in $G[E \setminus M_\omega]$ and let $C_i(V)$ denote the set of vertices in component $C_i$. We
have $\omega = \sum_{i<j} |C_i(V)||C_j(V)|$.

Observe that if we contract each SCC into a single node, we obtain the graph of
SCCs, which is a directed acyclic graph. Thus, there is a topological order for the SCCs, and
we follow the convention that vertices with no incoming edges have the smallest
orders. Thus, w.l.o.g., we assume that the removed edges always go from SCCs with
higher orders to SCCs with lower orders.

Consider all cuts $(S, \bar{S})$ of G that satisfy the following conditions:

1. Either $C_i(V) \subseteq S$ or $C_i(V) \subseteq \bar{S}$.

2. If $C_i(V) \subseteq S$ and there exists an edge from $C_i(V)$ to $C_j(V)$ in $G[E \setminus M_\omega]$, then $C_j(V) \subseteq S$.

Clearly, $(S, \bar{S}) \subseteq M_\omega$; hence, $c_{out}(S) \le c(M_\omega)$. For a given pair of SCCs $C_i$ and $C_k$, the
probability that $C_i(V)$ and $C_k(V)$ belong to different sides of the cut is at least 1/3, since
there are four possible ways of assigning $C_i(V)$ and $C_k(V)$ to the two sides of the cut, and at
most one out of the four is forbidden by the second condition. Thus, the $|C_i(V)||C_k(V)|$
pairs of vertices between $C_i$ and $C_k$ are separated with probability at least 1/3. Hence,
the expected number of pairs separated by a cut $(S, \bar{S})$ is at least

$E[|S||\bar{S}|] \ge \frac{1}{3}\sum_{i<j} |C_i(V)||C_j(V)| = \frac{1}{3}\omega.$

Among the cuts satisfying the two conditions above, there must be a cut $(S^*, \bar{S}^*)$ such that
$|S^*||\bar{S}^*| \ge \frac{1}{3}\omega$. Then,

$\alpha(G) \le \alpha(S^*) = \frac{c_{out}(S^*)}{|S^*||\bar{S}^*|} \le \frac{c(M_\omega)}{\frac{1}{3}\omega}.$

Hence, the lemma follows immediately.

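Now suppose G is not strongly connected. Let $T_1, T_2, \ldots, T_l$ be the SCCs of G, let $M_\omega^{(j)}$ be the
intersection of $M_\omega$ and the edges in $T_j$, and let $T'_j$ be the subgraph obtained from $T_j$ after
removing $M_\omega^{(j)}$. Applying the above result for the strongly connected case to each
SCC, we have

$c(M_\omega) = \sum_j c(M_\omega^{(j)}) \ge \frac{1}{3}\sum_j \alpha(T_j)\left(\mathcal{P}(T_j) - \mathcal{P}(T'_j)\right)$

$\ge \frac{1}{3}\alpha_{\min}(G)\sum_j \left(\mathcal{P}(T_j) - \mathcal{P}(T'_j)\right)$

$= \frac{1}{3}\alpha_{\min}(G)\left(\mathcal{P}(G) - \mathcal{P}(G[E \setminus M_\omega])\right).$

Thus, the lemma holds for every graph G. □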
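
The quantities used in Lemma 11 can be made concrete with a small sketch. The Python code below assumes that directed pairwise connectivity counts the unordered vertex pairs lying in a common SCC, which is how the proof above uses it. The function names `sccs` and `pairwise_connectivity` and the toy graph are illustrative and do not come from the original implementation.

```python
from collections import defaultdict

def sccs(n, edges):
    """Strongly connected components of a directed graph on vertices 0..n-1
    (Kosaraju's algorithm, written iteratively)."""
    adj, radj = defaultdict(list), defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # First pass: record vertices in order of DFS completion.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(adj[s]))]
        while stack:
            u, it = stack[-1]
            for v in it:
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, iter(adj[v])))
                    break
            else:
                order.append(u)
                stack.pop()

    # Second pass: explore the reversed graph in reverse completion order.
    comp, comps = [-1] * n, []
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = len(comps)
        members, stack = [s], [s]
        while stack:
            u = stack.pop()
            for v in radj[u]:
                if comp[v] == -1:
                    comp[v] = comp[s]
                    members.append(v)
                    stack.append(v)
        comps.append(members)
    return comps

def pairwise_connectivity(n, edges):
    """Number of unordered vertex pairs that lie in a common SCC."""
    return sum(len(c) * (len(c) - 1) // 2 for c in sccs(n, edges))

# Toy example: a directed 4-cycle is strongly connected, so all C(4,2) pairs count.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
removed = {(3, 0)}                                   # plays the role of M_omega
remaining = [e for e in cycle if e not in removed]
omega = pairwise_connectivity(4, cycle) - pairwise_connectivity(4, remaining)
```

With unit edge costs, the ratio c(M_ω)/ω for this toy removal is 1/`omega`, which is the quantity that the lemma bounds from below.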

The quality and performance of JLNA depend on the selection of SPARSE CUT. For
example, an exact algorithm to find a minimum ratio cut will lead to a constant-factor
bicriteria approximation algorithm for $\beta$-disruptor. Unfortunately, finding the min ratio
cut is an NP-hard problem [9]. Thus we have to rely on approximation algorithms to find
good ratio cuts in the graph.

Theorem 4.1. For any fixed $0 < \beta' < \beta$, algorithm JLNA finds a $\beta$-disruptor of cost at
most $O(\sqrt{\log n})\,\mathrm{OPT}_{\beta'}$, where $\mathrm{OPT}_{\beta'}$ is the cost of a minimum $\beta'$-disruptor.

Proof. The proof consists of two steps. In the first step, we prove that $D_\beta = (V_\beta, E_\beta)$ is
a $\beta$-disruptor of G. In the second step, we prove that the cost of $D_\beta$ is at most $O(\sqrt{\log n})$
times the cost of a minimum $\beta'$-disruptor, denoted by $\mathrm{OPT}_{\beta'}$.

In order to prove that $D_\beta$ is a $\beta$-disruptor of G, we show that the pairwise connectivity
of $G[-D_\beta] = (V \setminus V_\beta, E \setminus (E_\beta \cup V_\beta \times V_\beta))$, the graph that remains after removing $D_\beta$ from G, is at most $\beta\binom{n}{2}$.

First, observe that vertices $v^-$ and $v^+$ are either in the same SCC or both
isolated. Here, we say a vertex is isolated if it belongs to an SCC of size one. Assume
that $G'[E' \setminus E'_\beta]$ can be decomposed into SCCs $C'_1, C'_2, \ldots, C'_l$ and 2t isolated vertices
$w_1^-, w_1^+, \ldots, w_t^-, w_t^+$. Based on the construction of $G'$, we can verify that there are l
corresponding SCCs $C_1, C_2, \ldots, C_l$ and t isolated vertices $w_1, w_2, \ldots, w_t$ in $G[-D_\beta]$.
Moreover, $|C'_i| = 2|C_i|$ for i = 1..l.

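Therefore, we have

$\beta'\binom{2n}{2} \ge \mathcal{P}(G'[E' \setminus E'_\beta]) = \sum_i \binom{|C'_i|}{2} = \sum_i \binom{2|C_i|}{2}$

$= 4\sum_i \binom{|C_i|}{2} + \sum_i |C_i| = 4\,\mathcal{P}(G[-D_\beta]) + (n - t).$

Since $\beta' < \beta$, we have

$\mathcal{P}(G[-D_\beta]) \le \frac{1}{4}\left(\beta'\binom{2n}{2} - (n - t)\right) \le \beta\binom{n}{2}.$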

Thus,  we  have  completed  the  first  step.  We  prove  the  second  step  as  follows. 

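Let $D^*_{\beta'} = (V_{\beta'}, E_{\beta'})$ be a minimum $\beta'$-disruptor, i.e., $c(D^*_{\beta'}) = \mathrm{OPT}_{\beta'}$. Define

$E'_{\beta'} = \{(v^-, v^+) \mid v \in V_{\beta'}\} \cup \{(u^+, v^-) \mid (u, v) \in E_{\beta'}\}.$

By mapping the SCCs of $G[-D^*_{\beta'}]$ to those of $G'[E' \setminus E'_{\beta'}]$ as in the first step, we can show
that $E'_{\beta'}$ is a $\beta'$-edge disruptor of $G'$. Thus,

$\mathrm{OPT}^{e}_{\beta'}(G') \le \mathrm{OPT}_{\beta'}(G).$

Since $\beta' < \beta$, by Lemma 11, if removing a set of edges $M_\omega \subset E$ disrupts $\omega$ pairs of
vertices, then $c(M_\omega)/\omega \ge \frac{1}{3}\alpha_{\min}(G)$. At any round in the while loop of RBA, since the set of
edges $E^*_{\beta'}$ in a minimum $\beta'$-edge disruptor, for some $0 < \beta' < \beta$, can disrupt at least
$(\beta - \beta')\binom{n}{2}$ more pairs in G, we have

$\mathrm{OPT}^{e}_{\beta'} / \left((\beta - \beta')\binom{n}{2}\right) \ge \frac{1}{3}\alpha_{\min}(G).$    (4-7)

Since our cut procedure is an $O(\sqrt{\log n})$-factor approximation algorithm for the min
cut ratio problem, the average cost to disrupt a pair by removing the cut edges is upper bounded by
$O(\sqrt{\log n})\,\alpha_{\min}(G)$. By (4-7), the average cost to disrupt pairs in the graph at any step is
at most $O(\sqrt{\log n})\,\mathrm{OPT}^{e}_{\beta'} / \left((\beta - \beta')\binom{n}{2}\right)$. Therefore, even when $E_\beta$ disrupts all $\binom{n}{2}$ pairs
in G, the total cost is no more than

$O(\sqrt{\log n}) \times \frac{\mathrm{OPT}^{e}_{\beta'}}{(\beta - \beta')\binom{n}{2}} \times \binom{n}{2} = \frac{O(\sqrt{\log n})}{\beta - \beta'} \times \mathrm{OPT}^{e}_{\beta'}.$

Thus we have

$c(D_\beta) \le O(\sqrt{\log 2n})\,\mathrm{OPT}^{e}_{\beta'}(G') \le O(\sqrt{\log n})\,\mathrm{OPT}_{\beta'}(G).$

That yields the proof. □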
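
The proof relies on an auxiliary graph in which every vertex v is split into $v^-$ and $v^+$. The following Python sketch shows one way such a construction and the mapping to $E'_{\beta'}$ could look. It is inferred only from the definition of $E'_{\beta'}$ above: the construction itself (Algorithm 8) is not reproduced in this section, so the function names and the assignment of c(v) to the edge $(v^-, v^+)$ are assumptions.

```python
def split_nodes(vertices, edges, vertex_cost, edge_cost):
    """Build the auxiliary directed graph: every vertex v becomes the edge
    (v-, v+) carrying cost c(v); every edge (u, v) becomes (u+, v-) carrying
    cost c(u, v). For an undirected input, list each edge in both directions."""
    aux_vertices = [(v, sign) for v in vertices for sign in ("-", "+")]
    aux_edges = {}
    for v in vertices:
        aux_edges[((v, "-"), (v, "+"))] = vertex_cost[v]
    for (u, v) in edges:
        aux_edges[((u, "+"), (v, "-"))] = edge_cost[(u, v)]
    return aux_vertices, aux_edges

def lift_disruptor(removed_vertices, removed_edges):
    """Map a disruptor (V_b, E_b) of G to the edge set
    {(v-, v+) : v in V_b} U {(u+, v-) : (u, v) in E_b} of the auxiliary graph."""
    lifted = {((v, "-"), (v, "+")) for v in removed_vertices}
    lifted |= {((u, "+"), (v, "-")) for (u, v) in removed_edges}
    return lifted
```

Under these assumptions, the lifted edge set has the same total cost as the original disruptor, which is the property used to compare the two optima in the proof.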

4.3  Hybrid  Meta-heuristic 

Our second choice for SPARSE CUT is a simple yet efficient spectral bisection
method [61]. The $\beta$-edge disruptor found by RBA is further optimized by a hybrid of
variable neighborhood search [60] and simulated annealing [51]. Numerical results
in Section 6.4 suggest that our hybrid method is competitive for the $\beta$-edge disruptor
problem.

4.3.1  Spectral  Bisection 

Let $A = \{c_{ij}\}$ be the cost matrix of G = (V, E), where $c_{ij} = c(v_i, v_j)$ is the cost of
edge $(v_i, v_j)$ and $c_{ij} = 0$ if $(v_i, v_j) \notin E$. The unnormalized graph Laplacian matrix [61]
is defined as L = D - A, where D is a diagonal matrix with the weighted degrees of
vertices on the diagonal.

Algorithm 9: Spectral_Bisection(G)

    Compute eigenvector x corresponding to $\lambda_2$ of L
    Sort entries in x
    for p = 1 to n
        Calculate ratio of the cut $S_p = \{v_i \mid x_i \le x_p\}$
    return $S_p$ with the best ratio cut

The matrix L is symmetric and positive semi-definite, since for every vector $x \in \mathbb{R}^n$
we have

$x^{T} L x = \frac{1}{2}\sum_{i,j=1}^{n} c_{ij}(x_i - x_j)^2 \ge 0.$    (4-8)

L has n non-negative, real-valued eigenvalues $\lambda_1 = 0 \le \lambda_2 \le \ldots \le \lambda_n$. The second
smallest eigenvalue of L, $\lambda_2$, is known as the algebraic connectivity of the graph and
can be used to describe many properties of graphs [61]. We shall use the eigenvector
corresponding to $\lambda_2$ to derive the bisection of vertices in G.

Recall that SPARSE CUT aims to find the min ratio cut

$\min_{S \subset V} \frac{c(S, \bar{S})}{|S||\bar{S}|}.$    (4-9)

Consider a vector $x \in \{0, 1\}^n$ representing a set of vertices S, i.e., $x_i = 1$ if $v_i \in S$ and
$x_i = 0$ otherwise. We rewrite the min ratio cut problem as

$\min_{x \in \{0,1\}^n,\ x \ne \mathbf{0}, \mathbf{1}} \frac{\sum_{(v_i, v_j) \in E} c_{ij}(x_i - x_j)^2}{\sum_{i<j}(x_i - x_j)^2}.$    (4-10)

Since the problem is NP-hard, we relax the condition $x_i \in \{0, 1\}$ to $x_i \in [0, 1]$. Substitute
x with the vector $y = x - \bar{x}\mathbf{1}$, where $\bar{x} = \frac{1}{n}\sum_i x_i$. After some algebra, we obtain an equivalent problem of (4-10):

$\min_{y \ne \mathbf{0},\ y \perp \mathbf{1}} \frac{1}{n}\,\frac{y^{T} L y}{y^{T} y}.$    (4-11)

By the Courant-Fischer theorem [61], the solution of the above minimization problem is
exactly the eigenvector corresponding to the second smallest eigenvalue $\lambda_2$ of L. So
we can approximate the optimal solution of the min ratio cut problem with the second
eigenvector of L by transforming the real-valued x into a zero-one vector. One simple
way is to sort the $x_i$ to give a linear ordering of the vertices and then determine the splitting
index p that yields the best cut ratio. The whole procedure is summarized in Algorithm 9.

Assuming that the eigenvalues can be found within a constant number of iterations
[54], the RBA algorithm will have an $O(n^2)$ time complexity.
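
The sweep in Algorithm 9 can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiments: it replaces the ARPACK eigensolver with a plain power iteration on $cI - L$ (deflating the all-ones vector), which is only practical for small dense graphs, and the names `fiedler_vector` and `spectral_bisection` are illustrative.

```python
def fiedler_vector(cost, iters=500):
    """Approximate the eigenvector of the second smallest Laplacian eigenvalue
    by power iteration on (c*I - L), deflating the all-ones eigenvector."""
    n = len(cost)
    deg = [sum(row) for row in cost]
    c = 2.0 * max(deg)                          # upper bound on the largest eigenvalue of L
    x = [i - (n - 1) / 2.0 for i in range(n)]   # start orthogonal to the all-ones vector
    for _ in range(iters):
        # y = (c*I - L) x, with L = D - A
        y = [(c - deg[i]) * x[i] + sum(cost[i][j] * x[j] for j in range(n))
             for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]               # keep y orthogonal to the all-ones vector
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x

def spectral_bisection(cost):
    """Sweep over the sorted eigenvector entries and return the prefix set S
    with the smallest ratio c(S, S-bar) / (|S| |S-bar|), plus that ratio."""
    n = len(cost)
    x = fiedler_vector(cost)
    order = sorted(range(n), key=lambda i: x[i])
    best_ratio, best_set = float("inf"), None
    in_s = [False] * n
    for p in range(n - 1):                      # both sides must stay non-empty
        in_s[order[p]] = True
        cut = sum(cost[i][j] for i in range(n) if in_s[i]
                  for j in range(n) if not in_s[j])
        ratio = cut / ((p + 1) * (n - p - 1))
        if ratio < best_ratio:
            best_ratio, best_set = ratio, set(order[:p + 1])
    return best_set, best_ratio
```

On two triangles joined by a single unit-cost edge, the sweep should separate the two triangles, since that cut removes one edge and separates nine pairs.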

4.3.2  Hybrid  Meta-heuristic 

As we cannot control how many connected pairs SPARSE CUT will separate, the RBA
algorithm usually disrupts more connected pairs than required, resulting in suboptimal
solutions. Therefore, in order to further improve the performance of RBA, we introduce a
hybrid method, using both simulated annealing [51] and variable neighbourhood search
(VNS) [60]. The simulated annealing makes the number of connected pairs converge to the
desired level, while the local search methods explore alternative solutions to reduce the
cost.

For the $\beta$-edge disruptor problem, multiple neighborhood structures are essential to obtain
high quality solutions. To find a minimum $\beta$-edge disruptor, we aim to minimize the cut
ratio, which may lead to disrupting more pairs than necessary and incurring higher costs.
Alternating among neighborhood structures enables us to seek edge disruptors with
both small ratio cuts and small costs. Similar to simulated annealing, we allow "uphill"
moves that increase the cost of the solution if they improve certain aspects of the
solution.

We consider four different neighborhood structures. From a solution or a partial
solution $E_\beta \subset E$, the set of neighbors in each neighborhood structure can be obtained
as follows.

•  Type 1: Merge two connected components in $G[E \setminus E_\beta]$, i.e., remove the edges
between them from $E_\beta$.

•  Type 2: Move a vertex from one component to an adjacent component in $G[E \setminus E_\beta]$.

•  Type 3: Swap places of two adjacent vertices (u,v) which belong to two different
components.

•  Type 4: Partition a component in $G[E \setminus E_\beta]$ with spectral bisection.

Algorithm 10: HMH(G)

    $E_\beta \leftarrow \emptyset$, $\tau \leftarrow \min\{\beta, 1 - \beta\}$
    while $\tau > 1/\binom{n}{2}$
        $\tau \leftarrow \tau/2$
        $E_\beta \leftarrow E_\beta \cup \mathrm{RBA}(G[E \setminus E_\beta], \beta - \tau)$
        for k = 1 to 3    /* Phase 1: Condensation */
            repeat
                Consider all type k neighbors $E'_\beta$ such that
                    $c(E'_\beta) < c(E_\beta)$ and $\mathcal{P}(G[E \setminus E'_\beta]) \le \beta\binom{n}{2}$
                Find among them $E'_\beta$ with the smallest cut ratio
                $E_\beta \leftarrow E'_\beta$
            until no change in $E_\beta$
        for k = 1 to 4    /* Phase 2: Exploration */
            repeat
                Consider all type k neighbors $E'_\beta$ such that
                    $(\beta - \tau)\binom{n}{2} \le \mathcal{P}(G[E \setminus E'_\beta]) \le (\beta + \tau)\binom{n}{2}$
                Find among them $E'_\beta$ with the smallest cut ratio
                $E_\beta \leftarrow E'_\beta$
            until no change in $E_\beta$
    return the best solution so far

Besides reducing the total cost, we also want to move to neighbors with a smaller cut
ratio, which is defined as

$\alpha(E_\beta) = \frac{c(E_\beta)}{\mathcal{P}(G) - \mathcal{P}(G[E \setminus E_\beta])}.$

The cut ratio is the average cost to disrupt pairs by removing edges in $E_\beta$. We use the
best improvement strategy, i.e., among eligible neighbors we change to the neighbor with
the smallest cut ratio.

Our hybrid meta-heuristic (HMH) is presented in Algorithm 10. We use a parameter
$\tau$, similar to the heating condition in simulated annealing [51], to control how far the
pairwise connectivity in the graph can diverge from the target connectivity $\beta\binom{n}{2}$. Every
round, $\tau$ is reduced by half until it is negligibly small. The algorithm alternates between
two phases: condensation and exploration. In the condensation phase, the goal is
to reduce the cost of the current $\beta$-edge disruptor. As mentioned, we do not favor the
neighbor with the largest decrease in cost but the one with the smallest cut ratio.

In the exploration phase, we emphasize improving the cut ratio to find potentially
good partitions of the network. Moving to neighbors with higher costs is possible during
this phase as long as the pairwise connectivity differs by at most $\tau\binom{n}{2}$ from the target
connectivity level $\beta\binom{n}{2}$. If the outcome of the exploration phase is not a $\beta$-edge disruptor, the
RBA algorithm is invoked to produce a greedy solution before the algorithm continues
with the condensation phase again. Finally, the algorithm outputs the smallest-cost $\beta$-edge
disruptor encountered during the search.

Since the algorithm has at most $\log\binom{n}{2} = O(\log n)$ phases, and it spends at most
$O(n^3)$ time to improve the solution within each phase, the HMH algorithm has a time
complexity of $O(n^3 \log n)$. In our experiments, it has (almost) the same running time as
RBA using spectral bisection in place of SPARSE CUT, which has an $O(n^2)$ time complexity.
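
The cut ratio $\alpha(E_\beta)$ and the best-improvement rule used in both phases can be illustrated with a short Python sketch for undirected graphs. The helper names (`pairwise_connectivity`, `cut_ratio`, `best_neighbor`) are hypothetical, and the candidate neighbors are passed in explicitly rather than generated from the four neighborhood types.

```python
def pairwise_connectivity(n, edges):
    """Number of connected vertex pairs in an undirected graph on 0..n-1."""
    parent = list(range(n))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]     # path halving
            u = parent[u]
        return u

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for u in range(n):
        r = find(u)
        sizes[r] = sizes.get(r, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())

def cut_ratio(n, cost, removed):
    """Cost of the removed edges divided by the number of pairs they disrupt.
    `cost` maps each edge to its removal cost; `removed` is a set of edges."""
    kept = [e for e in cost if e not in removed]
    disrupted = pairwise_connectivity(n, cost) - pairwise_connectivity(n, kept)
    if disrupted == 0:
        return float("inf")                   # nothing is disrupted: worst possible ratio
    return sum(cost[e] for e in removed) / disrupted

def best_neighbor(n, cost, candidates, beta):
    """Best-improvement step: among candidate edge sets that are beta-edge
    disruptors, return the one with the smallest cut ratio (or None)."""
    limit = beta * n * (n - 1) / 2
    feasible = [c for c in candidates
                if pairwise_connectivity(n, [e for e in cost if e not in c]) <= limit]
    return min(feasible, key=lambda c: cut_ratio(n, cost, c), default=None)
```

For a unit-cost path on four vertices, removing the middle edge disrupts four pairs (ratio 1/4) while removing an end edge disrupts three (ratio 1/3), so the middle edge is preferred.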

Directed HMH Algorithm. The algorithm to find a sparse cut in [2] has a high time
complexity $O(n^{9.5})$ as it requires solving a large semidefinite program. Fortunately,
we can again turn to spectral partitioning to find small ratio cuts in directed graphs and
further optimize the solution using techniques similar to those in Algorithm HMH.

The major change is to replace the spectral bisection for undirected graphs with one
for directed graphs. This can be done by transforming the asymmetric adjacency matrix A
into a symmetric one using one of the symmetrization methods such as $(A + A^{T})/2$, $AA^{T}$,
etc. [57]. Besides, we consider a new type of neighborhood (Type 5), in which we can
remove (or un-remove) a node in the graph. With one of the mentioned symmetrization
methods, the HMH algorithm for directed graphs also has time complexity $O(n^3 \log n)$.
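
The two symmetrization options named above are simple enough to write out. The sketch below uses plain nested lists and illustrative function names. It only shows the matrix transformations, and which of them works better for directed spectral cuts is not claimed here.

```python
def symmetrize_average(a):
    """Return (A + A^T) / 2 for a square matrix given as nested lists."""
    n = len(a)
    return [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]

def symmetrize_product(a):
    """Return A A^T: entry (i, j) sums a[i][k] * a[j][k] over k."""
    n = len(a)
    return [[sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Either result is symmetric and can be passed to an undirected spectral bisection routine.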

4.4  Experimental  Studies 


Figure 4-3. The normalized optimal costs of three different disruptor types on the US
Backbone network (horizontal axis: $\beta$ (%)). Panels: A) a = 0 and b = 0.25; B) a = 0 and
b = 2.25; C) a = 0.25 and b = 2.25; D) a = 1.25 and b = 1.


We  illustrate  through  our  experiments  the  need  to  assess  network  vulnerability 
under  joint  node  and  link  attacks. 

4.4.1  Experiment  Setups 

4.4.1.1 Datasets

The experiments are performed on three real communication networks, namely
IP Backbone [1], CAIDA AS [56], and Oregon AS [56], and a set of four synthetic
networks described in subsection 4.4.3. The network details are given in the subsequent
subsections and the references.

4.4.1.2 Removal cost schemes

Assigning meaningful costs to edges and vertices is a challenging task that
usually depends on the availability of the data. For simplicity, we assume that all
edges have uniform removal costs $c(e) = 1\ \forall e \in E$. Note that we can always multiply
edge and vertex costs simultaneously by a constant; all optimal disruptors then stay
optimal (with the costs multiplied by the same constant). We assign the cost of removing
a vertex u to be c(u) = b + a d(u), where b and a are non-negative constants. In
other words, attacking a node requires paying a base cost b and an extra cost that is
proportional to the degree centrality. Other centrality measurements, e.g., PageRank and
betweenness centrality, can also be used in place of d(u) to weight the importance of u.
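
As a small illustration of this cost scheme, the Python sketch below assigns unit costs to edges and c(u) = b + a d(u) to vertices. The function name `removal_costs` and the edge-list input format are illustrative choices.

```python
def removal_costs(n, edges, a, b):
    """Unit cost for every edge and c(u) = b + a * d(u) for every vertex u,
    where d(u) is the degree of u in the undirected graph on 0..n-1."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    edge_cost = {e: 1.0 for e in edges}
    vertex_cost = [b + a * degree[u] for u in range(n)]
    return edge_cost, vertex_cost
```

For a star with three leaves and the setting a = 0.25, b = 2.25, the center costs 3.0 to remove and each leaf costs 2.5.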

4.4.1.3 Finding the optimal disruptor

The optimal solutions are found by solving the MILP in Section 4.1.1 with the
sparse metric and advanced cutting-plane techniques from our previous work [30]. The
mathematical optimization package used to solve the IP is GUROBI 4.5.

4.4.1.4 Solving for the second eigenvector

Most of the running time of HMH (Algorithm 10) is spent on finding the eigenvector of
the second smallest eigenvalue of the Laplacian matrix. The eigenvectors are found using the Implicitly
Restarted Arnoldi Method, implemented in ARPACK [54]. We use SuperLU [29] as the
linear systems solver.

We use the shift-and-invert spectral transformation to enhance the convergence
rate ¹. We select a scalar $\sigma = 0.01$, called the shift, and transform the original problem
$Lx = \lambda x$ into the shift-and-invert problem $(L - \sigma I)^{-1}x = \nu x$, where $\nu = 1/(\lambda - \sigma)$. Note
that setting $\sigma = 0$ will crash ARPACK since L is non-invertible (the sum of its rows equals zero) ².

In the case of JLNA, spectral bisection is performed on the symmetrized matrix
$A' + A'^{T}$, where $A'$ is the adjacency matrix of the auxiliary graph $G'$, constructed in
Algorithm 8.

¹ In many cases, the regular mode does not converge after 20,000 iterations.

Figure 4-4. Costs of disruptor algorithms on the synthetic networks

Figure 4-5. Running time of disruptor algorithms on the synthetic networks

4.4.1.5 Implementation details

All algorithms are implemented in C++ and compiled with the GCC 4.4 compiler on a
64-bit Linux machine with a quad-core AMD Opteron 2350 2.0 GHz processor and 32 GB
of memory. Only a single core is used during the experiments.
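
The shift-and-invert idea from Section 4.4.1.4 can be demonstrated without ARPACK. The Python sketch below is a stand-in under stated assumptions: it runs plain inverse iteration with a dense Gaussian-elimination solve instead of the Implicitly Restarted Arnoldi Method with SuperLU, and it removes the all-ones component at every step so that the iterate converges to the eigenvector of $\lambda_2$ rather than that of $\lambda_1 = 0$. The function names are illustrative.

```python
def solve(m, rhs):
    """Solve m y = rhs by Gaussian elimination with partial pivoting."""
    n = len(m)
    a = [row[:] + [rhs[i]] for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= f * a[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (a[i][n] - sum(a[i][k] * y[k] for k in range(i + 1, n))) / a[i][i]
    return y

def second_eigenpair(lap, sigma=0.01, iters=200):
    """Inverse iteration with the operator (L - sigma*I)^{-1}, deflating the
    all-ones vector. Returns (approximate lambda_2, unit eigenvector)."""
    n = len(lap)
    shifted = [[lap[i][j] - (sigma if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    x = [i - (n - 1) / 2.0 for i in range(n)]   # start orthogonal to all-ones
    for _ in range(iters):
        y = solve(shifted, x)                   # y = (L - sigma*I)^{-1} x
        mean = sum(y) / n
        y = [v - mean for v in y]               # remove the lambda_1 = 0 component
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    lx = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * lx[i] for i in range(n)), x   # Rayleigh quotient, vector
```

With `sigma=0` the shifted matrix is the singular Laplacian itself, so the solve step breaks down on a zero or near-zero pivot, which mirrors the failure described above.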

4.4.2  Comparison  of  the  three  disruptor  types 

Before analyzing the experimental results, given in Fig. 4-3, for the three different
disruptor types (edge, vertex, and general), we summarize the provable connections
among those three. First, the cost of the optimal $\beta$-disruptor is never more than the cost of
either the optimal $\beta$-edge disruptor or the optimal $\beta$-vertex disruptor. Second, if the cost scheme c(u) = b + a d(u) is in
use, we can tell when the cost of the optimal $\beta$-disruptor meets exactly the minimum
of that of the $\beta$-edge disruptor and the $\beta$-vertex disruptor.

² The 'eigs' function to find eigenvalues in MATLAB crashes for this reason.

Note that if c(u) < c(u, v) for some $(u, v) \in E$, then the edge (u, v) should not be
removed (as we can remove u instead). Similarly, a node u with $c(u) > \sum_{(u,v) \in E} c(u, v)$
will not be removed since we can always remove all of its incident edges. Therefore, we
obtain the following properties.

•  a = 0, b < 1: $\mathrm{OPT}_\beta = \mathrm{OPT}^{v}_\beta \le \mathrm{OPT}^{e}_\beta$, i.e., the optimal $\beta$-disruptor contains no
edges.

•  a = 0, b > 1: the optimal solutions contain no u with d(u) < b.

•  0 < a < 1: the optimal $\beta$-disruptor contains only vertices of degree at least b/(1 - a).

•  1 < a: $\mathrm{OPT}_\beta = \mathrm{OPT}^{e}_\beta \le \mathrm{OPT}^{v}_\beta$, i.e., the optimal $\beta$-disruptor contains no vertices.

We test four different settings of a and b that correspond to the above four cases on
the fiber backbone operated by a major U.S. network provider [1]. The optimal costs
of the three disruptor types are shown in Fig. 4-3. In Fig. 4-3A, the costs of the $\beta$-disruptor
equal exactly the costs of the $\beta$-vertex disruptor; in Fig. 4-3D, the costs of the $\beta$-disruptor equal
exactly the costs of the $\beta$-edge disruptor. These agree with the four cases mentioned above.
However, for Figs. 4-3B and 4-3C, the costs of the $\beta$-disruptor are strictly less than the
minimum of those of the edge disruptor and the vertex disruptor. In addition, for small $\beta$ the cost of the
edge disruptor is less than that of the vertex disruptor, while for large $\beta$ the vertex disruptor
has substantially smaller cost. This suggests that small scale attacks should target
links, while large scale attacks should pay more attention to nodes to reduce the attack
cost. Nevertheless, a combination of both node and link attacks would result in a more
cost-effective strategy to break the network.

4.4.3 Synthetic Networks of Different Topologies

We test our algorithms on moderate-sized synthetic networks to 1) compare the
solutions of our algorithms to the optimal solutions obtained by solving the integer
program (IP), and 2) verify the performance of the algorithms across different
network topologies. Four synthetic networks of 100 nodes and approximately 200 edges
are generated following the complex network models below.

•  Erdős-Rényi: A random graph of 100 vertices and 200 edges following the
Erdős-Rényi model [36].

•  Barabási-Albert: A power-law model using the preferential attachment mechanism
[12].

•  Watts-Strogatz: A random graph that exhibits the small-world phenomenon, following
the model in [79] with lattice dimension 2 and rewiring probability 0.3.

•  Forest fire: A random power-law graph following the forest fire model by Leskovec
et al. [56] with forward and backward burning probabilities 0.3 and 0.9,
respectively.
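
As an example of how the first of these test networks could be produced, the Python sketch below draws a random graph with a fixed number of edges. It assumes the G(n, m) variant of the Erdős-Rényi model (m distinct edges chosen uniformly at random), which matches the stated size of 100 vertices and 200 edges. The generator actually used in the experiments is not specified in the text, so the function name and the variant are assumptions.

```python
import random

def gnm_random_graph(n, m, seed=None):
    """Sample m distinct undirected edges uniformly at random among all
    n*(n-1)/2 vertex pairs (Erdos-Renyi G(n, m) model)."""
    rng = random.Random(seed)
    all_pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
    return rng.sample(all_pairs, m)

# A network of the size used in the experiments; the seed is arbitrary.
edges = gnm_random_graph(100, 200, seed=1)
```

Enumerating all pairs is fine at this scale (4,950 pairs); a much larger graph would call for rejection sampling instead.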

Set up. We show the costs produced by both types of disruptors, $\beta$-edge disruptor
and $\beta$-disruptor, in Fig. 4-4. The measured $\beta$-edge disruptor algorithms are RBA (the
RBA algorithm using spectral bisection in the place of SPARSE CUT), HMH, and the optimal
$\beta$-edge disruptor (Opt. edge dis.); the measured $\beta$-disruptor algorithms are
JLNA and the optimal $\beta$-disruptor (Opt. $\beta$-dis.). The costs of vertices follow the linear scale
c(u) = b + a d(u), where a = 0.25 and b = 0.25.

$\beta$-edge disruptor. Among the $\beta$-edge disruptor algorithms, HMH matches the optimal
solutions obtained by solving the IP (Opt. edge dis.) most of the time; in the other cases,
the gap between the two is negligibly small. However, there are much larger gaps
between RBA's solutions and the optimal solutions. The reason RBA is less competitive
is that it may separate many more node pairs than required, especially when $\beta$ is
small. For example, when $\beta$ = 80% the cost of RBA is more than three times higher than
that of the HMH algorithm and the optimal solutions. This implies that the annealing
procedure and the VNS in HMH are capable of correcting the overcutting caused
by RBA.

$\beta$-disruptor. The $\beta$-disruptor costs found by JLNA also closely approach those
of the optimal (Opt. $\beta$-dis.). On average, the solution of JLNA is only 15% larger than
the optimal. However, the gap between JLNA and the optimal is less impressive than
that between HMH and the optimal edge disruptor. A possible reason is that
we simply symmetrize the adjacency matrix to find the directed cut;
JLNA's performance could be enhanced with better directed spectral cut algorithms.
Nevertheless, the cost of JLNA is substantially less than that of the optimal edge
disruptor. Thus, JLNA is able to reveal vulnerability at both links and nodes in the
network.


Figure 4-6. Oregon AS network (panels: removal cost; running time (s))

Running time. Fig. 4-5 shows the running time on the four synthetic networks. Both
IPs take excessive amounts of time (up to 10 hours) to return the solution on the
Erdős-Rényi, Barabási-Albert, and Watts-Strogatz networks. In contrast, RBA, HMH, and JLNA
take less than one second to complete in all cases. Since the JLNA algorithm splits the
nodes in the network (thus doubling the network size), it has a slightly higher running time
than RBA and HMH.

Figure 4-7. CAIDA AS network (panels: removal cost; running time (s))

Overall,  HMH  and  JLNA  algorithms  prove  to  be  excellent  choices  to  find  disruptors  in 
moderate-sized  networks.  They  produce  high  quality  solutions  within  a  short  amount  of 
time  and  the  performance  is  stable  across  different  network  topologies.  We  further  study 
the  performance  of  HMH  and  JLNA  on  larger  (real)  communication  networks. 

4.4.4  AS  Relationships  Networks 

We analyze HMH and JLNA ³, with the same settings as in the last subsection, on the
following AS relationships datasets.

•  CAIDA AS: The CAIDA AS Relationships Dataset from Sep. 17, 2007 [56] with
8,020 nodes and 36,406 links.

•  Oregon AS: AS peering information inferred from Oregon route-views between
March 31 and May 26, 2001 [56]. Only the largest connected component, with
11,174 nodes and 23,410 links, is considered.

The costs and running times are reported in Figs. 4-6 and 4-7. Similar to the cases of
the synthetic networks, the HMH algorithm continues to produce solutions with significantly
smaller costs than RBA's. For $\beta$ = 80% and $\beta$ = 60%, the cost of HMH is less than half of
that of RBA. In addition, both algorithms have almost identical running times. The major
portion of time is spent on finding the second eigenvector; performing the local search
and annealing procedure is relatively inexpensive. Overall, HMH dominates RBA, with
higher quality solutions within (almost) the same running time.

³ The IPs cannot handle networks with more than a few hundred nodes.

Note that computing the eigenvector, the bottleneck in our algorithms, can be done
efficiently in a distributed manner. For example, Google is capable of solving the
eigenvalue problem on the web network of billions of nodes. Thus, the proposed
solutions are scalable to much larger networks.

Joint node and link attacks pose a serious threat to the network. In addition to
network connectivity, it is also important to assess the vulnerability of the network under
joint node and link attacks in terms of other performance metrics such as network
throughput, maximum network flow between source-destination pairs, and so on.
Furthermore, the problem of allocating resources to protect the network under the joint
attacks is of great importance and is the topic of our future study.

CHAPTER 5

VULNERABILITY ASSESSMENT IN PROBABILISTIC NETWORKS

We investigate the vulnerability of probabilistic networks under multiple attacks.
That is, we aim to identify the most critical subsets of infrastructure whose removal
maximizes the disruptive effect on the network in terms of connectivity. Finding such a
subset is extremely challenging due to the uncertainty of the network topology and the
exponentially large number of attack schemes. We show that finding an exact solution is
computationally intractable and propose an efficient two-stage stochastic program
to approximate the identification of the critical infrastructure. Furthermore, we propose
a novel sampling scheme, which finds solutions with guaranteed probabilistic accuracy.
Finally, we demonstrate the effectiveness and efficiency of the proposed algorithms on
real and synthetic data sets.

Disruptive events, ranging from natural disasters to malicious attacks, can
drastically compromise the network's ability to meet its quality-of-service (QoS)
requirements, if not cause widespread service outages and potentially total network
breakdown [46, 62, 64, 68]. Moreover, there is a significant concern over critical
infrastructures in electrical power grids and highway systems as targets for terrorist
attacks [67]. To mitigate the risk and develop proactive responses, it is essential to
assess network vulnerability to identify the most destructive attack scenarios.

Although there has been a significant amount of work on assessing network
vulnerability, most previous works focus mainly on using centrality measurements,
e.g., degree, betweenness, and closeness centralities [6, 7], to identify critical links or
nodes. Unfortunately, these approaches only determine the relative importance of a
small number of nodes or links and cannot reveal the enormous damage potential
of simultaneous attacks. Another set of works studies link and node removal
problems that optimize several global graph measures, such as clustering coefficient,
network diameter, etc. However, these measures are not well suited to particular kinds
of network vulnerability, when the network connectivity is of high priority. To this end,
pairwise connectivity, the number of node pairs that remain connected, has been
recently used as an effective measure to account for the effect of the attacks [11, 30, 33,
62, 64].

5.1 Probabilistic Networks

In  this  section,  we  first  present  the  considered  probabilistic  network  model,  followed 
by  the  formal  definition  of  the  studied  problems. 

5.1.1  Probabilistic  Network  Model 

The  network  with  uncertain  links  is  modeled  using  a  tuple  Q  =  (V,  E,p)  where 
vertices  in  V  corresponds  to  the  set  of  nodes,  edges  in  E  corresponds  to  the  set  of 
links  in  the  network,  and  p  :  E  ->  [0, 1]  maps  each  edge  (u,v)  e  E  to  a  real  number  in 
puv  e  [0, 1]  that  represents  the  availability  of  (u,v)  and  puv  =  0  for  all  (u,v)  £  E.  Further, 
denote  by  £  the  adjacency  matrix  of  Q,  i.e.,  Pr[£u„  =  1]  =  puv  and  Pr[£u„  =  0]  =  1  -  puv 
for  all  pairs  (u,v).  For  clarity,  we  consider  only  undirected  networks  and  assume  that 
the  existing  of  edges  are  independent  of  one  another,  though  our  approaches  also 
apply  in  principle  to  directed  graphs  or  graphs  with  edge  correlations  as  long  as  we  can 
effectively  generate  samples  of  the  probabilistic  graph. 

A sample graph (or a realization) $G^l = (V, E^l)$ of $\mathcal{G}$ is generated by selecting each
edge $e \in E$ independently with probability $p_e$. The sample space $\mathcal{S}_{\mathcal{G}}$ consists of $N = 2^{|E|}$ possible
samples $\mathcal{S}_{\mathcal{G}} = \{G^1 = (V, E^1), G^2 = (V, E^2), \ldots, G^N = (V, E^N)\}$ of $\mathcal{G}$ that correspond to the
$2^{|E|}$ possible subsets of $E$. The probability that $G^l$ is sampled from $\mathcal{G}$ is given by

$$f_{\mathcal{G}}(G^l) = \Pr[\mathcal{G} = G^l] = \prod_{e \in E^l} p_e \prod_{e \in E \setminus E^l} (1 - p_e).$$

Moreover, the matrix $\{\xi^l_{uv}\}$ is used to denote the adjacency matrix of $G^l$.
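
The sampling rule and the probability mass of a realization can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the dictionary representation of the edge probabilities and the function names `sample_realization` and `realization_probability` are hypothetical and do not come from the report.

```python
import random

def sample_realization(edges, rng):
    """Draw one realization: keep each edge e independently with probability p_e."""
    return {e for e, p in edges.items() if rng.random() < p}

def realization_probability(edges, kept):
    """Probability mass of a realization: the product of p_e over the kept edges
    times the product of (1 - p_e) over the dropped edges."""
    prob = 1.0
    for e, p in edges.items():
        prob *= p if e in kept else (1.0 - p)
    return prob

# Hypothetical 3-node example; keys are undirected edges (u, v), values are p_uv.
edges = {(0, 1): 0.5, (1, 2): 0.25, (0, 2): 1.0}
rng = random.Random(0)
sample = sample_realization(edges, rng)
```

Because the edges are independent, the probability of a realization factorizes over the edges; an edge with probability one appears in every sample.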

5.1.2  Expected  Pairwise  Connectivity 

As mentioned above, our measure for the disruptive effect is based on the value of the
expected pairwise connectivity (EPC), which is the (expected) number of connected pairs in the
residual network. For a deterministic graph $G^l$, the pairwise connectivity, denoted by
$\mathcal{P}(G^l)$, is the number of pairs $(u,v)$ with at least one path between $u$ and $v$. Naturally, the
expected pairwise connectivity for the probabilistic graph $\mathcal{G}$ is defined as

$$\mathrm{EPC}(\mathcal{G}) = E[\mathcal{P}(\mathcal{G})] = \sum_{l=1}^{N} f_{\mathcal{G}}(G^l)\, \mathcal{P}(G^l).$$
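
For very small graphs the definition can be evaluated literally by enumerating all $2^{|E|}$ realizations. The following Python sketch does this; it is a brute-force illustration only (exponential in the number of edges), and the names `pairwise_connectivity` and `exact_epc` are hypothetical.

```python
def pairwise_connectivity(n, edge_list):
    """Number of connected node pairs in a deterministic graph on nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edge_list:
        parent[find(u)] = find(v)
    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    # A component with s nodes contributes s * (s - 1) / 2 connected pairs.
    return sum(s * (s - 1) // 2 for s in sizes.values())

def exact_epc(n, edges):
    """Exact EPC: sum over all realizations of probability * pairwise connectivity."""
    items = list(edges.items())
    total = 0.0
    for mask in range(1 << len(items)):
        prob, kept = 1.0, []
        for i, (e, p) in enumerate(items):
            if (mask >> i) & 1:
                prob *= p
                kept.append(e)
            else:
                prob *= 1.0 - p
        total += prob * pairwise_connectivity(n, kept)
    return total
```

For a path on three nodes with both edge probabilities equal to 1/2, the two adjacent pairs are connected with probability 1/2 each and the end points with probability 1/4, so the EPC is 1.25.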

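Lemma 12. Given a probabilistic graph $\mathcal{G} = (V, E, p)$, we have

$$\mathrm{EPC}(\mathcal{G}) = \frac{1}{2} \sum_{u, v \in V,\ u \neq v} \mathrm{REL}_{u,v}(\mathcal{G}),$$

where $\mathrm{REL}_{u,v}(\mathcal{G})$ is the probability that $v$ is reachable from $u$ within $\mathcal{G}$.

Partial order. Given two probabilistic graphs $\mathcal{G}_A = (V_A, E_A, p^A)$ and $\mathcal{G}_B = (V_B, E_B, p^B)$,
we say $\mathcal{G}_A$ is dominated by $\mathcal{G}_B$, and write $\mathcal{G}_A \preceq \mathcal{G}_B$, if $V_A \subseteq V_B$, $E_A \subseteq E_B$, and
$p^A_e \le p^B_e$ for all $e \in E_A$.

Lemma 13. Given two probabilistic graphs $\mathcal{G}_A$ and $\mathcal{G}_B$, if $\mathcal{G}_A \preceq \mathcal{G}_B$, then $\mathrm{EPC}(\mathcal{G}_A) \le
\mathrm{EPC}(\mathcal{G}_B)$.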

5.1.3  Vulnerability  Assessment 

We define the following problem based on its deterministic versions in [10, 33].
Here the number of nodes/edges to remove is given, and we wish to maximize
the expected disruptive effect.

k-Probabilistic Critical Nodes Problem (k-pCNP). Given a probabilistic network
$\mathcal{G} = (V, E, p)$ and an integer $0 < k < n$, find a subset $S \subset V$ of $k$ nodes whose removal
minimizes the expected pairwise connectivity in the residual network after removing the
nodes in $S$.

5.2  Estimation  of  Connectivity  in  Probabilistic  Networks 
5.2.1  #P-Completeness 

In this paper, we first show that computing the expected pairwise connectivity in a
probabilistic network is #P-complete. A computation problem $f$ in #P is said to be
#P-complete if every problem in #P is reducible to $f$. Here #P is the class of “counting
versions” of problems in NP, i.e., problems of the form “compute the number
of solutions” for a problem in NP. Showing that a computation problem is #P-complete
makes a strong statement about its intractability: if such a problem were computable in
polynomial time, then not only P = NP but also P = PH.

Theorem 5.1. Computing the expected pairwise connectivity $\mathrm{EPC}(\mathcal{G})$, given a probabilistic
graph $\mathcal{G}$, is #P-complete.

Proof. We prove the theorem by a reduction from the counting problem of $s$-$t$
connectedness in an undirected graph [75]. The problem is to count the number of
subgraphs of a graph $G$ in which there is a path from $s$ to $t$. The problem is equivalent
to computing the probability that $s$ is connected to $t$ when each edge in $G$ is
present independently with probability 1/2 and absent with probability 1/2.

We reduce this problem to the expected pairwise connectivity computation problem
as follows. We first construct four probabilistic graphs $G_0, G_1, G_2, G_3$, where

• $G_0 = G$ and $p(e) = 1/2$ for all $e \in E$.

• $G_1$ is obtained from $G_0$ by adding a new node $s'$ and an edge $(s, s')$ with $p(s, s') = 1$.

• $G_2$ is obtained from $G_0$ by adding a new node $t'$ and an edge $(t, t')$ with $p(t, t') = 1$.

• $G_3$ is obtained from $G_0$ by adding nodes $s'$ and $t'$, and edges $(s, s')$ and $(t, t')$ with
probabilities $p(s, s') = p(t, t') = 1$.

Next we compute $P_0 = \mathrm{EPC}(G_0)$, $P_1 = \mathrm{EPC}(G_1)$, $P_2 = \mathrm{EPC}(G_2)$, and $P_3 = \mathrm{EPC}(G_3)$.

Then, we can return $P_0 - P_1 - P_2 + P_3$ as the probability that $s$ is connected to $t$ and
thus we solve the $s$-$t$ connectedness counting problem. In addition, there is an obvious
reduction from the expected pairwise connectivity computation problem to the $s$-$t$
connectedness problem via the equality $\mathrm{EPC}(\mathcal{G}) = \frac{1}{2}\sum_{u \neq v} \mathrm{REL}_{u,v}(\mathcal{G})$ of Lemma 12. It is shown in [75]
that $s$-$t$ connectedness is #P-complete, and thus the expected pairwise connectivity
computation problem is also #P-complete.

Finally, we prove that $P_0 - P_1 - P_2 + P_3 = \mathrm{REL}_{s,t}(G)$, the probability that $s$ and
$t$ are connected in $G$. By the construction of $G_1$, we have

$$P_1 = P_0 + \sum_{v \in V} \mathrm{REL}_{s',v}(G_1) = P_0 + \sum_{v \in V} \mathrm{REL}_{s,v}(G).$$

Similarly, $P_2 = P_0 + \sum_{v \in V} \mathrm{REL}_{v,t}(G)$ and

$$P_3 = P_0 + \sum_{v \in V} \mathrm{REL}_{s,v}(G) + \sum_{v \in V} \mathrm{REL}_{v,t}(G) + \mathrm{REL}_{s,t}(G).$$

It is straightforward to verify that $\mathrm{REL}_{s,t}(G) = P_0 - P_1 - P_2 + P_3$. □
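
The identity at the heart of the reduction can be checked numerically on a tiny instance. The sketch below is a brute-force illustration, not part of the proof; the graph `g0`, the choice $s = 0$, $t = 3$, and all function names are hypothetical, and the added nodes $s'$ and $t'$ are represented by the fresh indices `n` and `n + 1`.

```python
def components(n, kept):
    """Component label of every node of a deterministic graph (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in kept:
        parent[find(u)] = find(v)
    return [find(x) for x in range(n)]

def realizations(edges):
    """Yield (probability, kept_edges) for all 2^|E| realizations."""
    items = list(edges.items())
    for mask in range(1 << len(items)):
        prob, kept = 1.0, []
        for i, (e, p) in enumerate(items):
            if (mask >> i) & 1:
                prob *= p
                kept.append(e)
            else:
                prob *= 1.0 - p
        yield prob, kept

def epc(n, edges):
    """Exact expected pairwise connectivity by enumeration."""
    total = 0.0
    for prob, kept in realizations(edges):
        lab = components(n, kept)
        total += prob * sum(lab[u] == lab[v] for u in range(n) for v in range(u + 1, n))
    return total

def rel(n, edges, s, t):
    """Exact probability that s and t are connected."""
    return sum(prob for prob, kept in realizations(edges)
               if components(n, kept)[s] == components(n, kept)[t])

# Hypothetical 4-node graph G; every edge is present with probability 1/2.
n, s, t = 4, 0, 3
g0 = {(0, 1): 0.5, (1, 2): 0.5, (2, 3): 0.5, (0, 2): 0.5}
g1 = dict(g0)
g1[(s, n)] = 1.0          # G1: new node s' = n attached to s
g2 = dict(g0)
g2[(t, n)] = 1.0          # G2: new node t' = n attached to t
g3 = dict(g0)
g3[(s, n)] = 1.0          # G3: both s' = n and t' = n + 1
g3[(t, n + 1)] = 1.0
p0, p1, p2, p3 = epc(n, g0), epc(n + 1, g1), epc(n + 1, g2), epc(n + 2, g3)
recovered = p0 - p1 - p2 + p3
```

Here `recovered` should coincide with `rel(n, g0, s, t)` up to floating-point error, which is exactly the statement $P_0 - P_1 - P_2 + P_3 = \mathrm{REL}_{s,t}(G)$.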

We are interested in $(\epsilon, \delta)$-approximations for $\mathrm{EPC}(\mathcal{G})$, i.e., algorithms returning an
estimate of $\mathrm{EPC}(\mathcal{G})$ accurate to within a relative error of $\epsilon$ with probability at least $1 - \delta$.
Formally, we define $(\epsilon, \delta)$-approximations as follows.

Definition 7 ($(\epsilon, \delta)$-approximation). A function $F(\mathcal{G})$ is an $(\epsilon, \delta)$-approximation for the
expected pairwise connectivity $E[\mathcal{P}(\mathcal{G})]$ if

$$\Pr\bigl[(1 - \epsilon)\, E[\mathcal{P}(\mathcal{G})] \le F(\mathcal{G}) \le (1 + \epsilon)\, E[\mathcal{P}(\mathcal{G})]\bigr] \ge 1 - \delta.$$

An $(\epsilon, \delta)$-approximation is called a fully polynomial randomized approximation
scheme (FPRAS) if its running time is bounded by a polynomial in $1/\epsilon$, $\log(1/\delta)$, and
the input size. An FPRAS is generally regarded as a robust notion of “approximation
algorithm” for counting problems. Sinclair and Jerrum showed that every #P-complete
problem either has an FPRAS, or is essentially impossible to approximate [70].

5.2.2  Monte-Carlo  Methods  to  Approximate  EPC 

We present a simple Monte-Carlo algorithm to estimate the EPC in Algorithm 11.
The algorithm draws $N_1(\epsilon, \delta)$ samples of $\mathcal{G}$. Each sample is generated by including each
edge $e \in E$ independently with probability $p_e$. The average pairwise connectivity over the $N_1(\epsilon, \delta)$ sample
graphs is computed and returned as an unbiased estimator for $\mathrm{EPC}(\mathcal{G})$.

Algorithm 11. $(\epsilon, \delta)$ Monte-Carlo Algorithm to compute $\mathrm{EPC}(\mathcal{G})$

1. $C_1 \leftarrow 0$.

2. for $i = 1$ to $N_1(\epsilon, \delta)$ do

• Draw a sample graph $G^i$ of $\mathcal{G}$.

• $C_1 = C_1 + \mathcal{P}(G^i)$.

3. Return $S_1 = C_1 / N_1(\epsilon, \delta)$ as an unbiased estimator of $\mathrm{EPC}(\mathcal{G})$.
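
A minimal Python sketch of the Monte-Carlo estimator in Algorithm 11 follows. The number of samples is passed in directly rather than derived from $(\epsilon, \delta)$, and the function names and graph representation are hypothetical choices made for illustration.

```python
import random

def pairwise_connectivity(n, edge_list):
    """Number of connected node pairs in a deterministic graph on nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edge_list:
        parent[find(u)] = find(v)
    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())

def monte_carlo_epc(n, edges, num_samples, rng):
    """Average pairwise connectivity over independently drawn sample graphs."""
    total = 0
    for _ in range(num_samples):
        # Draw a sample graph: keep each edge with its own probability.
        kept = [e for e, p in edges.items() if rng.random() < p]
        total += pairwise_connectivity(n, kept)
    return total / num_samples
```

Each iteration costs one pass over the edges plus a union-find pass, in line with the near-linear per-sample cost assumed in the complexity analysis.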


The number of necessary samples to be drawn, denoted by $N_1(\epsilon, \delta)$, is derived
based on the following Generalized Zero-One Estimator Theorem introduced by Dagum
et al. [28].

Theorem 5.2. (Generalized Zero-One Estimator [28]) Let $X_1, X_2, \ldots, X_N$ be independent
identically distributed random variables taking values in $[0, 1]$, with mean $\mu > 0$. If
$0 < \epsilon < 1$ and $N \ge 4(e - 2) \ln(2/\delta) \cdot 1/(\epsilon^2 \mu)$, where $e \approx 2.718$ is Euler’s number, then

$$\Pr\Bigl[(1 - \epsilon)\mu \le \frac{1}{N} \sum_{i=1}^{N} X_i \le (1 + \epsilon)\mu\Bigr] \ge 1 - \delta.$$

By applying Theorem 5.2 to the i.i.d. random variables $X_i = \mathcal{P}(G^i)/\binom{n}{2}$ with mean
$\mu = \mathrm{EPC}(\mathcal{G})/\binom{n}{2}$, we obtain the following lemma.

Lemma 14. If $N_1(\epsilon, \delta) \ge 4(e - 2) \ln(2/\delta) \cdot \binom{n}{2} / \bigl(\epsilon^2\, \mathrm{EPC}(\mathcal{G})\bigr)$, then $S_1$ is an $(\epsilon, \delta)$-approximation.
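
The sample bound can be evaluated numerically as in the following sketch. Since $\mathrm{EPC}(\mathcal{G})$ is the unknown quantity being estimated, the sketch assumes that a positive lower bound on it is available (the bound only grows when a smaller value is plugged in); the function names are hypothetical.

```python
import math

def zero_one_sample_size(eps, delta, mu):
    """Smallest integer N with N >= 4 (e - 2) ln(2 / delta) / (eps^2 * mu)."""
    return math.ceil(4 * (math.e - 2) * math.log(2 / delta) / (eps ** 2 * mu))

def monte_carlo_sample_size(eps, delta, n, epc_lower_bound):
    """Sample size for the Monte-Carlo estimator, using a lower bound on EPC.

    The normalized variable P(G^i) / C(n, 2) has mean EPC / C(n, 2); replacing
    EPC by a lower bound can only increase the returned sample size.
    """
    mu = epc_lower_bound / math.comb(n, 2)
    return zero_one_sample_size(eps, delta, mu)
```

The dependence on $1/\mathrm{EPC}(\mathcal{G})$ is visible directly: halving the lower bound doubles the number of samples, which is why small EPC values are the hard case.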

Time complexity. The time to draw a sample and compute its pairwise connectivity
is $O(m + n)$. Since we often regard $\delta$ as a constant, Algorithm 11 has a time complexity of

$$O\bigl((m + n)\, n^2\, \epsilon^{-2}\, \mathrm{EPC}(\mathcal{G})^{-1}\bigr).$$

If $\mathrm{EPC}(\mathcal{G})$ is bounded below by $1/\mathrm{poly}(n, m)$, then Algorithm 11 is an FPRAS. The
difficult case is the estimation of small values of $\mathrm{EPC}(\mathcal{G})$, where the algorithm
no longer runs in polynomial time. This motivates the construction of the better estimation
methods presented next.

5.2.3 Fully Polynomial Time Approximation Scheme
5.2.3.1 Component Sampling Algorithm

We present an importance sampling method to estimate $\mathrm{EPC}(\mathcal{G})$ in Algorithm 12.
Instead of generating the whole sample graph as in Algorithm 11, we select a node $u \in V$
uniformly at random and perform a Breadth-First Search from $u$ until reaching all nodes in
the connected component that contains $u$. The algorithm then computes the average, over the samples, of
the size of the component that contains $u$ less one, and multiplies the result by $n/2$ to obtain
an unbiased estimator.

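Algorithm 12. $(\epsilon, \delta)$ Component Sampling Algorithm to compute $\mathrm{EPC}(\mathcal{G})$

1. Let $P_E = \sum_{e \in E} p_e$.

2. if $P_E < 1/m$ then

3. return $S_2 = P_E$.

4. $C_2 \leftarrow 0$.

5. for $i = 1$ to $N_2(\epsilon, \delta)$ do

• Select a node $u \in V$ uniformly.

• Simulate a Breadth-First Search from $u$ in $\mathcal{G}$. Let $S_i$ be the
number of visited nodes.

• $C_2 = C_2 + (S_i - 1)$.

6. Return $S_2 = \frac{n\, C_2}{2 N_2(\epsilon, \delta)}$ as an unbiased estimator of $\mathrm{EPC}(\mathcal{G})$.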
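
The main loop of Algorithm 12 can be sketched in Python as below. The sketch makes two assumptions that should be read as such: the Breadth-First Search is simulated lazily by flipping the coin of an edge only when the search first needs it, and the small-$P_E$ shortcut of steps 1 to 3 is omitted. The function name and graph representation are hypothetical.

```python
import random
from collections import deque

def component_sampling_epc(n, edges, num_samples, rng):
    """Estimate EPC by sampling the component of a uniformly chosen node."""
    adj = {u: [] for u in range(n)}
    for (u, v), p in edges.items():
        adj[u].append((v, p))
        adj[v].append((u, p))
    total = 0
    for _ in range(num_samples):
        start = rng.randrange(n)
        visited = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w, p in adj[u]:
                # The coin for edge (u, w) is flipped only if w is still unvisited;
                # edges between two visited nodes cannot change the component.
                if w not in visited and rng.random() < p:
                    visited.add(w)
                    queue.append(w)
        total += len(visited) - 1
    # E[S_i - 1] = 2 EPC / n, so the average is rescaled by n / 2.
    return n * total / (2 * num_samples)
```

Each edge is examined with an unvisited end point at most once per sample, so the lazy search explores the same distribution of components as drawing the full sample graph first, while touching only the edges around the visited component.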


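Theorem 5.3. For $N_2(\epsilon, \delta) = 4(e - 2) \ln(2/\delta) \cdot \binom{n}{2} / \bigl(\epsilon^2\, \mathrm{EPC}(\mathcal{G})\bigr)$, $S_2$ is an $(\epsilon, \delta)$-approximation for $\mathrm{EPC}(\mathcal{G})$.

Proof. In the main loop of Algorithm 12, we can compute $S_i$ with the following equivalent
steps: 1) draw a sample graph $G^l$, and 2) select a node $u$ in $G^l$ uniformly and compute
$S_i$ as the size of the connected component that contains $u$. Assume that there are $t$
connected components with sizes $s_1, s_2, \ldots, s_t$ in $G^l$. Then

$$E[S_i - 1 \mid \mathcal{G} = G^l] = \sum_{j=1}^{t} \frac{s_j}{n} (s_j - 1) = \frac{2\, \mathcal{P}(G^l)}{n}.$$

Hence $E[S_i - 1] = 2\, \mathrm{EPC}(\mathcal{G})/n$.
By applying Theorem 5.2 to the i.i.d. variables $Y_i = (S_i - 1)/(n - 1)$ with mean $\mu = \mathrm{EPC}(\mathcal{G})/\binom{n}{2}$, it
follows that $S_2$ is an $(\epsilon, \delta)$-approximation of $\mathrm{EPC}(\mathcal{G})$. □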

Lemma 15. In any undirected probabilistic graph $\mathcal{G} = (V, E, p)$, we have

$$\sum_{e \in E} p_e \le \mathrm{EPC}(\mathcal{G}) \le \Bigl(1 + \frac{1}{m} \sum_{e \in E} p_e\Bigr)^{m}.$$

Proof. We prove the lower and upper bounds separately.

Lower bound: By Lemma 12, we have

$$\mathrm{EPC}(\mathcal{G}) = \frac{1}{2} \sum_{u,v \in V,\ u \neq v} \mathrm{REL}_{u,v}(\mathcal{G}) \ge \sum_{(u,v) \in E} \mathrm{REL}_{u,v}(\mathcal{G}) \ge \sum_{(u,v) \in E} p_{uv}.$$

Upper bound: First, we show that $\mathrm{EPC}(\mathcal{G}) \le \prod_{e \in E} (1 + p_e)$. Then we can apply the
inequality of arithmetic and geometric means [20] to the positive numbers $(1 + p_e)$, $e \in E$, to
obtain

$$\mathrm{EPC}(\mathcal{G}) \le \prod_{e \in E} (1 + p_e) \le \Bigl(1 + \frac{1}{m} \sum_{e \in E} p_e\Bigr)^{m}.$$

We prove $\mathrm{EPC}(\mathcal{G}) \le \prod_{e \in E} (1 + p_e)$ by induction on $\mu_E$, the number of undetermined
edges (those with probabilities strictly less than one).

Basis: If $\mu_E = 0$, we have a deterministic graph with $m = |E|$ edges. Since the
components that contain edges span at most $m + 1$ nodes in the best case of a single component, the pairwise connectivity is at most
$\frac{1}{2} m (m + 1) \le 2^m$ for all $m \ge 0$. Thus, the inequality holds for $\mu_E = 0$.

Induction step: Assume that the inequality holds for $\mu_E = t \ge 0$; we show that the
inequality also holds when $\mu_E = t + 1$. Assume that $\mu_E = t + 1$, select an arbitrary
undetermined edge $(u,v) \in E$ and perform branching on $(u,v)$ as shown in Eq. 5-27.
We have

$$\mathrm{EPC}(\mathcal{G}) = p_{uv}\, \mathrm{EPC}(\mathcal{G}^{+}) + (1 - p_{uv})\, \mathrm{EPC}(\mathcal{G}^{-}),$$

where $\mathcal{G}^{+}$ is obtained from $\mathcal{G}$ by setting the probability of $(u,v)$ to one and $\mathcal{G}^{-}$ is obtained
from $\mathcal{G}$ by removing $(u,v)$. Since both $\mathcal{G}^{+}$ and $\mathcal{G}^{-}$ have exactly $t$ undetermined edges,
we can apply the induction hypothesis to obtain

$$\mathrm{EPC}(\mathcal{G}) \le p_{uv} (1 + 1) \prod_{e \neq (u,v)} (1 + p_e) + (1 - p_{uv}) \prod_{e \neq (u,v)} (1 + p_e) = (1 + p_{uv}) \prod_{e \neq (u,v)} (1 + p_e) = \prod_{e \in E} (1 + p_e).$$

Thus, the inequality holds for all $\mu_E \ge 0$. □

The bounds in Lemma 15 are asymptotically tight in the sense that there are arbitrarily
large graphs in which the bounds differ from the actual values of $\mathrm{EPC}(\mathcal{G})$
by only a factor of two. For example, consider $\mathcal{G}$ as a star graph of size $n$ that consists of
one center vertex and $n - 1$ leaves. All $n - 1$ edges are assigned the same probability
$1/(n - 1)$. One can verify that the lower bound, $\mathrm{EPC}(\mathcal{G})$, and the upper bound are
$1$, $\frac{3}{2} - \frac{1}{2(n-1)}$, and $\bigl(1 + \frac{1}{n-1}\bigr)^{n-1} < e$, respectively.
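
As a worked check of the star-graph example (a sketch that uses only Lemma 12 and the independence of the edges), the three quantities can be computed directly. A center-leaf pair is connected exactly when its edge is present, and a leaf-leaf pair is connected exactly when both of its edges are present:

```latex
\begin{align*}
\sum_{e \in E} p_e &= (n-1)\cdot\frac{1}{n-1} = 1, \\
\mathrm{EPC}(\mathcal{G}) &= (n-1)\cdot\frac{1}{n-1} + \binom{n-1}{2}\cdot\frac{1}{(n-1)^2}
  = 1 + \frac{n-2}{2(n-1)} = \frac{3}{2} - \frac{1}{2(n-1)}, \\
\Bigl(1 + \frac{1}{m}\sum_{e \in E} p_e\Bigr)^{m} &= \Bigl(1 + \frac{1}{n-1}\Bigr)^{n-1} < e .
\end{align*}
```

So the true value lies between 1 and 3/2 for every $n \ge 2$, while the upper bound stays below $e \approx 2.718$.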

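Theorem 5.4. Algorithm 12 is a fully polynomial randomized approximation scheme
(FPRAS) for network connectivity (for any set of edge-dependent failure probabilities
$\{p_e\}_{e \in E}$).

Proof. Consider two cases, $P_E < n^{-2-\mu}$ and $P_E \ge n^{-2-\mu}$, for some arbitrarily small $\mu > 0$.

Case $P_E < n^{-2-\mu}$: Let $P_l$, $l = 0..m$, be the probability that the graph has exactly $l$
edge(s). We have $\sum_{l=0}^{m} P_l = 1$. In addition, let $P_{2+} = \sum_{l=2}^{m} P_l$. We have

$$P_0 = \prod_{e \in E} (1 - p_e) \ge 1 - P_E \qquad (5\text{-}1)$$

$$P_1 = \sum_{e \in E} p_e \prod_{e' \neq e} (1 - p_{e'}) = \Bigl(\sum_{e \in E} \frac{p_e}{1 - p_e}\Bigr) P_0 \qquad (5\text{-}2)$$

$$P_{2+} = 1 - P_0 - P_1 \le 1 - P_0 \Bigl(1 + \sum_{e \in E} \frac{p_e}{1 - p_e}\Bigr) \qquad (5\text{-}3)$$

$$\le 1 - (1 - P_E)(1 + P_E) = P_E^2 \qquad (5\text{-}4)$$

We have

$$P_E \le \mathrm{EPC}(\mathcal{G}) \le 0 \cdot P_0 + 1 \cdot P_1 + \binom{n}{2} P_{2+} \qquad (5\text{-}5)$$

$$\le \frac{P_E}{1 - P_E} + \binom{n}{2} P_E^2 \qquad (5\text{-}6)$$

$$\le \frac{P_E}{1 - P_E} + o(1) P_E = (1 + o(1)) P_E \qquad (5\text{-}7)$$

Therefore, $P_E$ is an $(\epsilon, \delta)$-approximation for $\mathrm{EPC}(\mathcal{G})$.

Case $P_E \ge n^{-2-\mu}$: By Lemma 15, $\mathrm{EPC}(\mathcal{G}) \ge P_E \ge n^{-2-\mu}$, so the number of samples in
Theorem 5.3 is polynomial, and the component sampling procedure gives
an $(\epsilon, \delta)$-approximation within polynomial time. □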

5.3  Vulnerability  Assessment  using  EPC 

In this section, we formulate the vulnerability assessment problems as a mathematical
programming problem and devise two approaches to overcome the difficulty of having
an exponential number of constraints in the mathematical formulation.

Linear programming for deterministic networks. Given a realization $G^l$ of $\mathcal{G}$, the
k-CNP problem in $G^l$ can be formulated as an integer linear program (ILP),
following [11],

$$\begin{aligned}
\min\ & \sum_{i<j} (1 - x_{ij}) && (5\text{-}8) \\
\text{s.t.}\ & \sum_{i=1}^{n} s_i \le k \\
& x_{ij} \le s_i + s_j + 1 - \xi^l_{ij}, && (i,j) \in E \\
& x_{ij} + x_{jk} \ge x_{ik}, && i, j, k = 1..n \\
& x_{ij} = x_{ji}, && i, j = 1..n \\
& s \in \{0, 1\}^n,\ x \in \{0, 1\}^{n^2} && (5\text{-}9)
\end{aligned}$$

where $s_i = 1$ if vertex $i$ is removed and $s_i = 0$ otherwise; $x_{ij}$ represents the “disconnectivity”
between a pair of nodes $i$ and $j$ after removing $\{i \in V \mid s_i = 1\}$, i.e., $x_{ij} = 1$ if there is
no $i$-$j$ path and $x_{ij} = 0$ otherwise.

For small networks, the program $P(s, x, \xi^l)$ can be solved optimally using branch-and-bound
methods. Moreover, further enhancements make it possible to find optimal
solutions for larger networks of several hundred nodes. These include the sparse
metric method in [30], which reduces the number of constraints from $O(n^3)$ to $O(mn)$, the
removal of the integrality conditions on $x_{ij}$, which reduces the number of integral variables
from $O(n^2)$ to $O(n)$, and specialized cutting planes [30, 45].

Two-stage stochastic programming. We propose a two-stage stochastic program for
the k-pCNP problem. Stochastic programming is a common
approach for optimization under uncertainty when the probability distribution that governs
the data is given. A comprehensive introduction to stochastic programming can be
found in reference [69]. Our formulation is presented as follows, with the two highlighted
enhancements from [30].

$$\begin{aligned}
\min_{s \in \{0,1\}^n}\ & E[P(s, x, \xi)] && (5\text{-}10) \\
\text{s.t.}\ & \sum_{i=1}^{n} s_i \le k && (5\text{-}11) \\
\text{where}\ P(s, x, \xi) = \min\ & \sum_{i<j} (1 - x_{ij}) && (5\text{-}12) \\
\text{s.t.}\ & x_{ij} \le s_i + s_j + 1 - \xi_{ij}, \quad (i,j) \in E, && (5\text{-}13) \\
& x_{ij} + x_{jk} \ge x_{ik}, \quad (i,j) \in E,\ k = 1..n && (5\text{-}14) \\
& x_{ij} = x_{ji}, \quad i, j = 1..n && (5\text{-}15) \\
& s \in \{0, 1\}^n,\ x \in [0, 1]^{n^2} && (5\text{-}16)
\end{aligned}$$


The optimal decision on which set of vertices to remove depends only on the
given topology and the edge probabilities, not on future observations of the edge
availabilities. The first-stage variables $s$ are to be decided before the actual realization of the
uncertain parameters in the adjacency matrix $\xi$. Our objective, the pairwise connectivity
in the residual graph, involves only the expected cost of $x$, the second-stage
(or recourse) variables.

Discretization. To solve the stochastic program numerically, one needs to consider
all possible realizations $G^l \in \mathcal{S}_{\mathcal{G}}$ and their probability masses $f_{\mathcal{G}}(G^l)$. Then the two-stage
stochastic program can be written as a (one-level) mixed integer program,
denoted by $\mathrm{MIP}_F$:

$$\begin{aligned}
\min\ & \sum_{l=1}^{N} f_{\mathcal{G}}(G^l) \sum_{i<j} (1 - x^l_{ij}) && (5\text{-}17) \\
\text{s.t.}\ & \sum_{i=1}^{n} s_i \le k && (5\text{-}18) \\
& x^l_{ij} \le s_i + s_j + 1 - \xi^l_{ij}, \quad (i,j) \in E,\ l = 1..N && (5\text{-}19) \\
& x^l_{ij} + x^l_{jk} \ge x^l_{ik}, \quad (i,j) \in E,\ k = 1..n,\ l = 1..N && (5\text{-}20) \\
& x^l_{ij} = x^l_{ji}, \quad i, j = 1..n,\ l = 1..N && (5\text{-}21) \\
& s \in \{0, 1\}^n,\ x^l \in [0, 1]^{n^2},\ l = 1..N && (5\text{-}22)
\end{aligned}$$

The major challenge in solving this discretized form is that there is an exponential number
($N = 2^{|E|}$) of variables and constraints. Thus, solving $\mathrm{MIP}_F$ is intractable even for very
small instances of $\mathcal{G}$. To overcome this difficulty, we present in the next two subsections two
approximate mathematical programs of substantially smaller sizes.

5.3.1  Approximating  via  the  Expectation  Graph 

Many clustering and optimization problems on probabilistic graphs can be reduced
to equivalent problems on the (deterministic) expectation graph, constructed by
casting the edge probabilities into weights. Specifically, the expectation graph of $\mathcal{G}$ is a
(deterministic) graph with the weighted matrix $\bar{\xi}$, where $\bar{\xi}_{ij} = p_{ij}$. The main challenge in
this approach is how to interpret the weights in a meaningful way.
Our first approach is to regard the weighted matrix $\bar{\xi}$ as a (binary) adjacency matrix
of a deterministic graph. Thus we obtain the following MIP, denoted by $\mathrm{MIP}_E$.

$$\begin{aligned}
\min\ & \sum_{i<j} (1 - x_{ij}) && (5\text{-}23) \\
\text{s.t.}\ & \sum_{i=1}^{n} s_i \le k && (5\text{-}24) \\
& x_{ij} \le s_i + s_j + 1 - \bar{\xi}_{ij}, \quad (i,j) \in E && (5\text{-}25) \\
& \text{Constraints (5-14), (5-15), \& (5-16)}
\end{aligned}$$

Not only does this relaxation have a polynomial number of constraints and variables, but
its optimal objective also provides a lower bound on the optimal objective of $\mathrm{MIP}_F$, as
stated in the following lemma.

Lemma 16. $\mathrm{MIP}_E$ is a mixed integer program with at most $n$ integral variables and
$O(mn)$ constraints. Moreover, the optimal objective of $\mathrm{MIP}_E$ is at most that of $\mathrm{MIP}_F$.

Proof. The number of integral variables and constraints can be proven similarly to [30].

To show that the objective of $\mathrm{MIP}_E$ is a lower bound on that of $\mathrm{MIP}_F$, we
construct a feasible solution $(\hat{s}, \hat{x})$ of $\mathrm{MIP}_E$ that gives an objective equal to the optimal
objective of $\mathrm{MIP}_F$.

Let $(\bar{s}, x^1, \ldots, x^N)$ be an optimal solution of $\mathrm{MIP}_F$. Construct a solution
$(\hat{s} = \bar{s},\ \hat{x} = \sum_{l=1}^{N} f_{\mathcal{G}}(G^l)\, x^l)$. The objective value of $\mathrm{MIP}_E$ given by that solution is

$$\sum_{i<j} (1 - \hat{x}_{ij}) = \sum_{i<j} \Bigl(1 - \sum_{l=1}^{N} f_{\mathcal{G}}(G^l)\, x^l_{ij}\Bigr) = \sum_{i<j} \sum_{l=1}^{N} f_{\mathcal{G}}(G^l) (1 - x^l_{ij}),$$

which is exactly the optimal objective of $\mathrm{MIP}_F$. The last equality holds because the
probabilities $f_{\mathcal{G}}(G^l)$ add up to one.
The rest is to show that $(\hat{s}, \hat{x})$ is a feasible solution of $\mathrm{MIP}_E$. Clearly, $\hat{s}$ satisfies (5-24)
and the integrality constraints. Also, since $\hat{x}$ is a convex combination of $x^l$, $l = 1..N$, with the
masses $f_{\mathcal{G}}(G^l)$, $\hat{x}$ satisfies the constraints (5-25), (5-14), (5-15), & (5-16), as they can be
inferred from the same convex combination of the constraints (5-19) to (5-22). □

One can solve $\mathrm{MIP}_E$ optimally using the branch-and-cut method in [30] to obtain
1) a set of $k$ critical nodes and 2) a lower bound on the minimum expected pairwise
connectivity after removing $k$ nodes. We note that the non-integrality of $x_{ij}$ is essential
for $\mathrm{MIP}_E$. When $x_{ij}$ is restricted to $\{0, 1\}$, e.g., as in (5-9), the constraint (5-25) is
essentially $x_{ij} \le s_i + s_j$ and $\mathrm{MIP}_E$ becomes equivalent to the IP (5-8)-(5-9). That is, the
information encoded in the edge probabilities is disregarded and only the network
topology is used in the formulation. In addition, since the convex combination of the $x^l$ is a
fractional vector, we would not be able to derive the lower bound given in Lemma 16.

In large networks, where the branch-and-cut algorithm starts to show its exponential running
time, the following randomized rounding algorithm can be used to obtain a set of $k$
critical nodes. The rounding procedure is described in Algorithm 13. The algorithm
repeatedly solves an LP relaxation of $\mathrm{MIP}_E$ and rounds up the maximum $s_i$ to one,
provided that $s_i$ has not been rounded before. After $k$ steps, the $k$ nodes that have $s_i = 1$ are
returned as the set of critical nodes. Since the LP relaxation has at most $O(mn)$
constraints, solving the LP relaxation takes $O(m^3 n^3)$ time [30] in the worst case. Thus
the total time complexity is at most $O(k m^3 n^3)$.

Algorithm 13. Rounding on the Expectation Graph Algorithm
(REGA)

1. Obtain an LP relaxation of $\mathrm{MIP}_E$ with the relaxed constraints
$s \in [0, 1]^n$.

2. Initialize the set of selected nodes $D = \emptyset$.

3. Repeat $k$ times the following steps

• Solve the LP relaxation

• Select $u = \arg\max_{i \in V \setminus D} s_i$.

• Add $u$ to $D$ and fix $s_u = 1$

4. Return $k$ critical nodes in $D$.


5.3.2  Sample  Average  Approximation  (SAA)  Method 

Our second approach to reduce the number of realizations is to apply the Sample
Average Approximation (SAA) method. We independently generate $T$ samples
$\xi^1, \xi^2, \ldots, \xi^T$ using Monte Carlo simulation (i.e., by including each edge $(u, v) \in E$
with probability $p_{uv}$). The expectation objective $q(s) = E[P(s, x, \xi)]$ is then approximated
by the sample average $\hat{q}_T(s) = \frac{1}{T} \sum_{l=1}^{T} \sum_{i<j} (1 - x^l_{ij})$, and the new formulation is then

$$\begin{aligned}
\min\ & \frac{1}{T} \sum_{l=1}^{T} \sum_{i<j} (1 - x^l_{ij}) \\
\text{s.t.}\ & \text{Constraints (5-18)-(5-22), replacing } N \text{ with } T
\end{aligned}$$

Under some regularity conditions, $\frac{1}{T} \sum_{l=1}^{T} \sum_{i<j} (1 - x^l_{ij})$ converges pointwise with
probability 1 to $E[P(s, x, \xi)]$ as $T \to \infty$. Moreover, an optimal solution of the sample
average approximation provides an optimal solution of the stochastic program with
probability approaching one exponentially fast w.r.t. $T$. Formally, denote by $(s^*, x^*)$ and $(\hat{s}, \hat{x})$ the
optimal solutions of the stochastic program and the sample average approximation,
respectively. For any $\epsilon > 0$, it can be derived from Proposition 2.2 in [52] that

$$\Pr\bigl[E[P(\hat{s}, \hat{x}, \xi)] - E[P(s^*, x^*, \xi)] > \epsilon\bigr] \le \exp\Bigl(-\frac{T \epsilon^2}{n^4} + n \log k\Bigr). \qquad (5\text{-}26)$$

Equivalently, if $T \ge \frac{n^4}{\epsilon^2} (n \log k - \log \alpha)$, then $\Pr\bigl[E[P(\hat{s}, \hat{x}, \xi)] - E[P(s^*, x^*, \xi)] \le \epsilon\bigr] \ge 1 - \alpha$
for any $\alpha \in (0, 1)$. Although this bound on $T$ may be too conservative for practical
use, it is expected that the optimal value and optimal solutions of the SAA problem
converge to their counterparts even with a reasonably small value of $T$. The
SAA method is summarized in Algorithm 14. The algorithm consists of two phases.
In the first phase, the delayed constraints technique is used to incrementally construct
and solve an LP relaxation of the SAA. In the second phase, the same iterative rounding
procedure as in Algorithm 13 is applied to find $k$ critical nodes by rounding up the fractional
solution.

Algorithm 14. Sample Ave. Approx. Algorithm (SA3)

Parameter $T$: the number of samples

Phase 1: Delayed Constraints

1. Initialize an LP with the objective $\frac{1}{T} \sum_{l=1}^{T} \sum_{i<j} (1 - x^l_{ij})$ and only
the constraints $s \in [0, 1]^n$, $x^l_{ij} \in [0, 1]$.

2. for $l = 1..T$ do

• Generate the $l$-th sample of $\mathcal{G}$ (adjacency matrix $\xi^l$).

• Add the constraints involving $x^l$ to the LP

• Solve the updated LP

Phase 2: Iterative rounding

3. Initialize the set of selected nodes $D = \emptyset$.

4. Repeat $k$ times the following steps

• Select $u = \arg\max_{i \in V \setminus D} s_i$.

• Add $u$ to $D$ and fix $s_u = 1$

• Re-solve the LP

5. Return $k$ critical nodes in $D$.


5.3.3  Local  Search  Heuristic 

A local search method, described in Algorithm 15, is used to find the $k$ critical nodes. The
algorithm first selects $k$ nodes in a greedy manner: each time, the algorithm selects the node
whose removal results in the largest degradation in terms of EPC. Then, the algorithm
attempts to swap a node $w$ outside the disruptor with a node $u$ in the disruptor, choosing the swap that
gives the sharpest decrease in EPC. The local search terminates when no improving swap
exists.
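
The greedy selection followed by swaps can be sketched in Python as follows. The sketch rests on assumptions that are not taken from the report: the EPC of a residual graph is approximated by the total pairwise connectivity over a fixed list of pre-drawn realizations, and the first improving swap is accepted instead of the sharpest one. All names are hypothetical.

```python
def residual_connectivity(n, edge_list, removed):
    """Pairwise connectivity of a deterministic graph after deleting `removed`."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edge_list:
        if u not in removed and v not in removed:
            parent[find(u)] = find(v)
    sizes = {}
    for x in range(n):
        if x not in removed:
            root = find(x)
            sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())

def greedy_with_swaps(n, realizations, k):
    """Pick k nodes greedily, then apply improving swaps until none exists."""
    def objective(removed):
        # Integer total over the fixed realizations (proportional to the average).
        return sum(residual_connectivity(n, kept, removed) for kept in realizations)

    chosen = set()
    for _ in range(k):
        best = min((v for v in range(n) if v not in chosen),
                   key=lambda v: objective(chosen | {v}))
        chosen.add(best)

    improved = True
    while improved:
        improved = False
        current = objective(chosen)
        for u in sorted(chosen):
            for w in range(n):
                if w in chosen:
                    continue
                candidate = (chosen - {u}) | {w}
                if objective(candidate) < current:
                    chosen, improved = candidate, True
                    break
            if improved:
                break
    return chosen
```

Because the objective is evaluated on a fixed set of realizations and every accepted swap strictly decreases an integer total, the swap loop is guaranteed to terminate. For instance, `greedy_with_swaps(5, [[(0, 1), (0, 2), (0, 3), (0, 4)]], 1)` selects the center of the star.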

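Algorithm 15. Iterative Greedy Algorithm (IGA)

1: $D \leftarrow \emptyset$

2: for $i = 1..k$ do

3: $u = \arg\min_{v \in V \setminus D} \mathrm{EPC}(\mathcal{G}[V \setminus (D \cup \{v\})])$

4: $D \leftarrow D + \{u\}$

5: while $\exists (u, v) \in D \times (V \setminus D)$

6: & (swapping $u$, $v$ decreases the objective) do

7: $D \leftarrow D + \{v\} - \{u\}$

8: Output $D$.

Proof of Lemma 12

Proof. This is derived directly from the definition of $\mathrm{EPC}(\mathcal{G})$. Define $\mathrm{conn}_{uv}(G^l) = 1$ if
there is a path between $u$ and $v$ in a sample graph $G^l$ and $\mathrm{conn}_{uv}(G^l) = 0$ otherwise.
We have

$$\begin{aligned}
\mathrm{EPC}(\mathcal{G}) &= \sum_{l=1}^{N} f_{\mathcal{G}}(G^l)\, \mathcal{P}(G^l) = \sum_{l=1}^{N} f_{\mathcal{G}}(G^l)\, \frac{1}{2} \sum_{u \neq v} \mathrm{conn}_{uv}(G^l) \\
&= \frac{1}{2} \sum_{u \neq v} \sum_{l=1}^{N} f_{\mathcal{G}}(G^l)\, \mathrm{conn}_{uv}(G^l) = \frac{1}{2} \sum_{u \neq v} \mathrm{REL}_{u,v}(\mathcal{G}).
\end{aligned}$$

□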

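Proof of Lemma 13

Proof. An edge $(u,v) \in E_B$ is said to be undetermined if $0 < p^A_{uv}\, p^B_{uv} < 1$. We prove the
lemma by induction on the number of undetermined edges within $E_B$, denoted by $\mu_B$.

Basis: If $\mu_B = 0$, both $\mathcal{G}_A$ and $\mathcal{G}_B$ are deterministic graphs, and the statement holds
trivially. Assume that the lemma is true when $\mu_B = t \ge 0$. We show that the lemma is
also true when $\mu_B = t + 1$.

Induction step: Assume that $\mu_B = t + 1$, pick an arbitrary undetermined edge $(u,v) \in E_B$ and
consider the branching on $(u,v)$:

$$\mathrm{EPC}(\mathcal{G}_A) = p^A_{uv}\, \mathrm{EPC}(\mathcal{G}_A^{+}) + (1 - p^A_{uv})\, \mathrm{EPC}(\mathcal{G}_A^{-}), \qquad (5\text{-}27)$$

where $\mathcal{G}_A^{+}$ is obtained from $\mathcal{G}_A$ by assigning $p_{uv} = 1$, and $\mathcal{G}_A^{-}$ is obtained from $\mathcal{G}_A$ by
removing the edge $(u,v)$.

Similarly, we have

$$\mathrm{EPC}(\mathcal{G}_B) = p^B_{uv}\, \mathrm{EPC}(\mathcal{G}_B^{+}) + (1 - p^B_{uv})\, \mathrm{EPC}(\mathcal{G}_B^{-}).$$

Since $\mathcal{G}_A \preceq \mathcal{G}_B$, it can be verified that $\mathcal{G}_A^{+} \preceq \mathcal{G}_B^{+}$ and $\mathcal{G}_A^{-} \preceq \mathcal{G}_B^{-}$. Note that the pairs
$(\mathcal{G}_A^{+}, \mathcal{G}_B^{+})$ and $(\mathcal{G}_A^{-}, \mathcal{G}_B^{-})$ have at most $t$ undetermined edges. By the induction hypothesis,
we have

$$\begin{aligned}
\mathrm{EPC}(\mathcal{G}_B) &\ge p^B_{uv}\, \mathrm{EPC}(\mathcal{G}_A^{+}) + (1 - p^B_{uv})\, \mathrm{EPC}(\mathcal{G}_A^{-}) \\
&\ge p^A_{uv}\, \mathrm{EPC}(\mathcal{G}_A^{+}) + (1 - p^A_{uv})\, \mathrm{EPC}(\mathcal{G}_A^{-}) = \mathrm{EPC}(\mathcal{G}_A).
\end{aligned}$$

The last inequality holds because $p^B_{uv} \ge p^A_{uv}$ and $\mathrm{EPC}(\mathcal{G}_A^{+}) \ge \mathrm{EPC}(\mathcal{G}_A^{-})$, which can
be shown based on the fact that each sample of the graph $\mathcal{G}_A^{+}$ can be generated by first
generating a sample of $\mathcal{G}_A^{-}$ and then adding $(u,v)$ to the sample. Obviously, adding an edge
to a (deterministic) graph will not decrease the pairwise connectivity. Thus, the lemma
holds for all $\mu_B \ge 0$. □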

CHAPTER 6

CASCADING-FAILURES IN NETWORKS
In this chapter, we formulate the measurement of vulnerability in the presence of
cascading failures as an optimization problem: the Cost-effective, massive and outbreak
problem (CFM). In Section 6.1, we analyze the propagation process on power-law
networks to give a lower bound on the seeding size. We present VirAds, a scalable
algorithm to find a minimal seeding for the CFM problem, in Section 6.2. The hardness
of finding a cost-effective seeding is addressed in Section 6.3. Finally, we perform
extensive experiments on large social networks such as Facebook and Orkut to
confirm the efficiency of our proposed algorithm and analyze the results to give new
observations on the information diffusion process in networks.

6.1  Seeding  Cost  of  Massive  Outbreak 
In this section, we exploit the power-law topology found in most complex networks
[12, 13, 25] to demonstrate that when the number of propagation hops is limited, a large number
of seeding nodes is needed to spread the influence throughout the network. The
size of the seeding is proved to be a constant fraction of the number of vertices $n$, which
is prohibitive for large social networks of millions of nodes. We first summarize the
well-known power-law model in [3]; then we use the model to prove the prohibitive
seeding cost for the CFM problem.

6.1.1  Power-law  Network  Model. 

Many complex systems of interest, including OSNs, are found to have degree
distributions that approximately follow power laws [12, 13, 25]. That is, the fraction of
nodes in the network having $k$ connections to other nodes is proportional to $k^{-\gamma}$, where
$\gamma$ is a parameter whose value is typically in the range $2 < \gamma < 3$. Those networks have
been used in studying different aspects of scale-free networks [3, 5, 39, 41]. We
follow the $P(\alpha, \gamma)$ power-law model in [3], in which the number of vertices of degree $k$
is $\lfloor e^{\alpha} / k^{\gamma} \rfloor$, where $e^{\alpha}$ is the normalization factor. For convenience, we shall refer to such a
network as a $P(\alpha, \gamma)$ network.

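We can deduce that the maximum degree in a $P(\alpha, \gamma)$ network is $e^{\alpha/\gamma}$ (since for
$k > e^{\alpha/\gamma}$, the number of vertices of degree $k$ will be less than 1). The number of vertices and edges are

$$n = \sum_{k=1}^{e^{\alpha/\gamma}} \frac{e^{\alpha}}{k^{\gamma}} \approx
\begin{cases}
\zeta(\gamma)\, e^{\alpha} & \text{if } \gamma > 1 \\
\alpha\, e^{\alpha} & \text{if } \gamma = 1 \\
\dfrac{e^{\alpha/\gamma}}{1 - \gamma} & \text{if } \gamma < 1
\end{cases},
\qquad
m = \frac{1}{2} \sum_{k=1}^{e^{\alpha/\gamma}} k\, \frac{e^{\alpha}}{k^{\gamma}} \approx
\begin{cases}
\frac{1}{2} \zeta(\gamma - 1)\, e^{\alpha} & \text{if } \gamma > 2 \\
\frac{1}{4} \alpha\, e^{\alpha} & \text{if } \gamma = 2 \\
\dfrac{1}{2} \dfrac{e^{2\alpha/\gamma}}{2 - \gamma} & \text{if } \gamma < 2
\end{cases}
\qquad (6\text{-}1)$$

where $\zeta(\gamma) = \sum_{i=1}^{\infty} \frac{1}{i^{\gamma}}$ is the Riemann Zeta function [3], which converges for $\gamma > 1$ and
diverges for all $\gamma \le 1$. Without affecting the conclusion, we will simply use real numbers
instead of rounding down to integers. The error terms are sufficiently small and can be
bounded in our proofs.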

While the scale of the network depends on $\alpha$, the parameter $\gamma$ decides the
connection pattern and many other important characteristics of the network. For
instance, the larger $\gamma$, the sparser and the more “power-law” the network is. Hence, the
parameter $\gamma$ is often regarded as the characteristic constant for scale-free networks.
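
To make the roles of the two parameters concrete, the following Python sketch tabulates the degree counts of the $P(\alpha, \gamma)$ model, assuming (as in the model of [3]) that there are $\lfloor e^{\alpha}/k^{\gamma} \rfloor$ vertices of degree $k$ for $k$ up to $e^{\alpha/\gamma}$. The function names are hypothetical, and the sketch only counts vertices and half-edges; it does not generate a graph.

```python
import math

def power_law_degree_counts(alpha, gamma):
    """Map each degree k to the number of vertices of that degree, floor(e^alpha / k^gamma)."""
    max_degree = int(math.floor(math.exp(alpha / gamma)))
    return {k: int(math.floor(math.exp(alpha) / k ** gamma))
            for k in range(1, max_degree + 1)}

def vertices_and_edges(alpha, gamma):
    """Number of vertices n and number of edges m (half of the total degree)."""
    counts = power_law_degree_counts(alpha, gamma)
    n = sum(counts.values())
    total_degree = sum(k * c for k, c in counts.items())
    return n, total_degree / 2

counts = power_law_degree_counts(5.0, 2.5)
n_vertices, m_edges = vertices_and_edges(5.0, 2.5)
```

Increasing `alpha` scales all counts up by the same factor, while increasing `gamma` shrinks the high-degree tail, which is the sense in which $\gamma$ controls the connection pattern.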

6.1.2  Prohibitive  Seeding  Costs 

We prove that the seeding must contain at least $\Omega(n)$ vertices if the propagation is
locally bounded. The result is stated in the following theorem.

Theorem 6.1. Given a power-law network $G \in P(\alpha, \gamma)$ with $\gamma > 2$ and constant
$0 < \rho < 1$, any $d$-seeding is of size at least $\Omega(n)$.

Proof. The proof consists of two parts. In the first part, we show that the volume, i.e., the
total degree of vertices, of any $d$-seeding must be $\Omega(m)$. In the second part, we prove
that any subset of vertices $S \subset V$ with volume $\mathrm{vol}(S) = \Omega(m)$ in a power-law network
with power-law exponent $\gamma > 2$ satisfies $|S| = \Omega(n)$. Thus, the theorem follows.

Figure 6-1. The influence propagation in the network.

In the first part, we consider two separate cases.

Case $p > 1/2$. Let $S = R_0$ be the optimal solution for the CFM problem on $G = (V, E)$,
and let $S = R_0, R_1, R_2, \ldots, R_d$ be the sets of vertices that become active at rounds $0, 1, 2, \ldots, d$,
respectively (see Fig. 6-1). Notice that $\{R_i\}_{i=0}^{d}$ form a partition of $V$. Moreover, for each
$1 \le t \le d$ the following inequality holds.

\[
\Big|\phi\Big(R_t, \bigcup_{i=0}^{t-1} R_i\Big)\Big| \;\ge\; \frac{p}{1-p} \left( \Big|\phi\Big(R_t, \bigcup_{j=t+1}^{d} R_j\Big)\Big| + 2\,|\phi(R_t, R_t)| \right) \tag{6-2}
\]

where $\phi(A, B)$ denotes the set of edges connecting one vertex in $A$ to one vertex in $B$.
The inequality means that at least a fraction $p$ among edges incident with the vertices
activated in round $t$ must be incident with active vertices in the previous rounds.

Summing the inequalities in (6-2) for $t = 1..d$, we have

\[
\sum_{t=1}^{d} \Big|\phi\Big(R_t, \bigcup_{i=0}^{t-1} R_i\Big)\Big| \;\ge\; \frac{p}{1-p} \sum_{t=1}^{d} \left( \Big|\phi\Big(R_t, \bigcup_{j=t+1}^{d} R_j\Big)\Big| + 2\,|\phi(R_t, R_t)| \right)
\]
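Eliminating the common terms on both sides, we have

\[
\sum_{i=0}^{d-1} \Big|\phi\Big(R_i, \bigcup_{j=i+1}^{d} R_j\Big)\Big| \;\ge\; \frac{p}{1-p} \left( \sum_{j=1}^{d-1} \Big|\phi\Big(R_j, \bigcup_{i=j+1}^{d} R_i\Big)\Big| + 2 \sum_{t=1}^{d} |\phi(R_t, R_t)| \right)
\]

After some algebra, we obtain

\[
\begin{aligned}
\mathrm{vol}(R_0) &\ge \Big|\phi\Big(R_0, \bigcup_{t=1}^{d} R_t\Big)\Big| \\
&\ge \frac{2p-1}{1-p} \sum_{j=1}^{d-1} \Big|\phi\Big(R_j, \bigcup_{i=j+1}^{d} R_i\Big)\Big| + \frac{2p}{1-p} \sum_{t=1}^{d} |\phi(R_t, R_t)| \\
&\ge \frac{2p-1}{1-p} \left( |E| - \Big|\phi\Big(R_0, \bigcup_{t=1}^{d} R_t\Big)\Big| - |\phi(R_0, R_0)| \right)
\end{aligned}
\tag{6-3}
\]

Since $|\phi(R_0, \bigcup_{t=1}^{d} R_t)| + |\phi(R_0, R_0)| \le \mathrm{vol}(R_0)$, rearranging (6-3) gives, when $p > 1/2$,
$\mathrm{vol}(R_0) \ge \frac{2p-1}{p}|E| = \Omega(m)$ for any $d$-seeding $R_0$.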

Case $p \le 1/2$. We say that an edge is active if it is incident to at least one active
vertex. At round $t = 0$, there are at most $\mathrm{vol}(R_0)$ active edges, namely those incident
to $R_0$. Eq. 6-2 implies that the number of active edges increases by a factor of at most
$p^{-1}$ in each round. After $d$ rounds, the number of active edges is bounded by $\mathrm{vol}(R_0) \times p^{-d}$.
Since all edges are active at the end, we have the inequality

\[
\mathrm{vol}(R_0) \ge p^{d}\,|E|.
\]

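In the second part of the proof, we show that if a subset $S \subset V$ has $\mathrm{vol}(S) = \Omega(m)$, then
$|S| = \Omega(n)$ whenever the power-law exponent $\gamma > 2$. Assume that $\mathrm{vol}(S) \ge cm$ for some
positive constant $c$. The size of $S$ is minimum when $S$ contains only the highest-degree
vertices of $V$. Let $k_0$ be the minimum degree of vertices in $S$ in that extreme case. By Eq.
6-1 we have

\[
cm = \frac{c}{2} \sum_{k=1}^{e^{\alpha/\gamma}} k\, \frac{e^{\alpha}}{k^{\gamma}} \;\le\; \mathrm{vol}(S) \;\le\; \sum_{k=k_0}^{e^{\alpha/\gamma}} k\, \frac{e^{\alpha}}{k^{\gamma}}
\]

Simplifying both sides, we have

\[
\sum_{k=1}^{k_0-1} \frac{1}{k^{\gamma-1}} \;\le\; \Big(1 - \frac{c}{2}\Big) \sum_{k=1}^{e^{\alpha/\gamma}} \frac{1}{k^{\gamma-1}} \;<\; \Big(1 - \frac{c}{2}\Big)\, \zeta(\gamma-1)
\]

Since the zeta function $\zeta(\gamma - 1)$ converges for $\gamma > 2$, there exists a constant $k_{p,\gamma}$ that
depends only on $p$ and $\gamma$ and satisfies

\[
\sum_{k=1}^{k_{p,\gamma}} \frac{1}{k^{\gamma-1}} \;\ge\; \Big(1 - \frac{c}{2}\Big)\, \zeta(\gamma-1)
\]

Obviously, we have $k_0 \le k_{p,\gamma}$. Thus, the number of vertices that are in $S$ is at least

\[
\sum_{k=k_{p,\gamma}+1}^{e^{\alpha/\gamma}} \frac{e^{\alpha}}{k^{\gamma}} \;\approx\; \Big( \zeta(\gamma) - \sum_{k=1}^{k_{p,\gamma}} \frac{1}{k^{\gamma}} \Big) e^{\alpha} = \Omega(n)
\]

We have the last step because the sum $\sum_{k=1}^{k_{p,\gamma}} \frac{1}{k^{\gamma}}$ is bounded by a constant, since $k_{p,\gamma}$ is a
constant. □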


In both cases $p > 1/2$ and $p \le 1/2$, the size of a $d$-seeding set is at least $\Omega(n)$.
However, we can see a clear difference in the propagation speed with respect to
$d$ between the two cases. When $p \le 1/2$, the number of active edges can increase
exponentially (but is still bounded if $d$ is a constant), and it is likely that the number
of active vertices also increases exponentially. In contrast, when $p > 1/2$, an explosion in
the number of active edges (and hence active vertices) is impossible, as the volume of
the $d$-seeding is tied to the number of edges $m$ by a fixed constant regardless of
the value of $d$.
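The round structure used in the proof can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: an inactive vertex $v$ is taken to become active once at least $\lceil p \cdot d(v) \rceil$ of its neighbors are active, the function names `propagate` and `volume` are hypothetical, and the 4-vertex path is an invented toy example. It reports the round in which each vertex is activated and checks the bound $\mathrm{vol}(R_0) \ge p^{d}|E|$.

```python
import math


def propagate(adj, seeds, p, d):
    """Run the d-round threshold propagation and return activation rounds.

    adj maps each vertex to the set of its neighbours. An inactive vertex v
    becomes active in round t when at least ceil(p * deg(v)) of its
    neighbours were active by round t - 1. Returns {vertex: round}.
    """
    rounds = {v: 0 for v in seeds}
    for t in range(1, d + 1):
        newly_active = [
            v for v in adj
            if v not in rounds
            # The small epsilon guards against floating-point overshoot.
            and sum(1 for w in adj[v] if w in rounds)
            >= math.ceil(p * len(adj[v]) - 1e-12)
        ]
        if not newly_active:
            break
        for v in newly_active:
            rounds[v] = t
    return rounds


def volume(adj, vertices):
    """Total degree of a set of vertices."""
    return sum(len(adj[v]) for v in vertices)


# Toy example (assumed): a path 0 - 1 - 2 - 3 seeded at vertex 0.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
rounds = propagate(path, {0}, p=0.5, d=3)
num_edges = volume(path, path) // 2
is_seeding = len(rounds) == len(path)
bound_holds = volume(path, {0}) >= 0.5 ** 3 * num_edges
print(rounds, is_seeding, bound_holds)
```

On the path, one new vertex is activated per round, so a single seed is a 3-seeding but not a 2-seeding.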

6.2  Algorithm  to  Identify  the  Minimum  Outbreak  Seeding 

In order to understand the influence propagation when the number of propagation
hops is bounded, we propose VirAds, an efficient algorithm for the CFM problem. With
the huge magnitude of OSN users and data available on OSNs, scalability becomes the
major problem in designing algorithms for CFM. VirAds scales to networks with hundreds
of millions of links and provides high-quality solutions in our experiments.

Before presenting VirAds, we consider a natural greedy algorithm for the CFM problem
in which the vertex that can activate the largest number of inactive vertices within $d$
hops is selected in each step. This greedy algorithm is unlikely to perform well in practice for
the following two reasons. First, at early steps, when not many vertices are selected, every
vertex is likely to activate only itself after being chosen as a seed. Thus, the algorithm
cannot distinguish between good and bad seeds. Second, the algorithm suffers serious
scalability problems. To select a vertex, the algorithm has to evaluate for each vertex $v$
how many vertices will be activated after adding $v$ to the seeding, e.g., by invoking an
$O(m + n)$ Breadth-First Search procedure rooted at $v$. In the worst case, when $O(n)$
vertices need to be evaluated, this alone can take $O(n(m + n))$ time. Moreover, as shown in
the previous section, the seeding size can easily be $\Omega(n)$; thus, the worst-case running
time of the naive greedy algorithm is $O(n^2(m + n))$, which is prohibitive for large-scale
networks.
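The naive greedy strategy described above can be sketched as follows. This is an illustrative sketch only: the names `spread` and `naive_greedy` are hypothetical, the activation threshold is assumed to be $\lceil p \cdot d(v) \rceil$ active neighbors, and candidates are restricted to currently inactive vertices. Each selection re-simulates the propagation for every candidate, which is exactly the cost that makes the approach impractical on large networks.

```python
import math


def spread(adj, seeds, p, d):
    """Return the set of vertices active after at most d propagation rounds."""
    active = set(seeds)
    for _ in range(d):
        new = {
            v for v in adj
            if v not in active
            and sum(w in active for w in adj[v])
            >= math.ceil(p * len(adj[v]) - 1e-12)
        }
        if not new:
            break
        active |= new
    return active


def naive_greedy(adj, p, d):
    """Repeatedly add the vertex whose selection activates the most vertices."""
    seeds = set()
    active = set()
    while len(active) < len(adj):
        # One full simulation per candidate: the scalability bottleneck.
        best = max(
            (v for v in adj if v not in active),
            key=lambda v: len(spread(adj, seeds | {v}, p, d)),
        )
        seeds.add(best)
        active = spread(adj, seeds, p, d)
    return seeds


# Toy example (assumed): a path 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(naive_greedy(path, p=0.5, d=3), naive_greedy(path, p=0.5, d=1))
```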

As shown in Algorithm 16, our VirAds algorithm overcomes the mentioned problems
of the naive greedy algorithm by favoring the vertex that can activate the largest number of edges
(indeed, it also considers the number of active neighbors around each vertex). This
avoids the first problem of the naive greedy algorithm. At early steps, the algorithm
behaves similarly to the degree-based heuristic that favors vertices with high degree.
However, once a certain number of vertices are selected, VirAds makes the selection
based on the information within the $d$-hop neighborhood around the considered vertices rather
than only the one-hop neighborhood as in the degree-based heuristic.

The scalability problem is tackled in VirAds by efficiently keeping track of the
following measures for each vertex $v$.

• $r_v$: The round in which $v$ is activated.

• $n_v^{(e)}$: The number of new active edges after adding $v$ into the seeding.

• $n_v^{(a)}$: The number of extra active neighbors $v$ needs in order to activate $v$.

• $r_v^{(i)}$: The number of activated neighbors of $v$ up to round $i$, where $i = 1..d$.

Given those measures, VirAds selects in each step the vertex $u$ with the highest
effectiveness, which is defined as $n_u^{(e)} + n_u^{(a)}$. After that, the algorithm needs to update the
measures for all the remaining vertices.

Algorithm 16: VirAds: Finding Influence Nodes in Networks
Input: Graph $G = (V, E)$, $0 < p < 1$, $d \in \mathbb{N}^+$
Output: A small $d$-seeding
$n_v^{(e)} \leftarrow d(v)$, $n_v^{(a)} \leftarrow p \cdot d(v)$, $r_v \leftarrow d + 1$, $\forall v \in V$;
$r_v^{(i)} = 0$, $i = 0..d$; $P \leftarrow \emptyset$;
while there exist inactive vertices do
    repeat
        $u \leftarrow \arg\max_{v \notin P} \{n_v^{(e)} + n_v^{(a)}\}$;
        Recompute $n_u^{(e)}$ as the number of new active edges after adding $u$.
    until $u = \arg\max_{v \notin P} \{n_v^{(e)} + n_v^{(a)}\}$;
    $P \leftarrow P \cup \{u\}$;
    Initialize a queue: $Q \leftarrow \{(u, r_u)\}$;
    $r_u \leftarrow 0$;
    foreach $x \in N(u)$ do
        $n_x^{(a)} \leftarrow \max\{n_x^{(a)} - 1, 0\}$;
    while $Q \neq \emptyset$ do
        $(t, \tilde{r}_t) \leftarrow Q.\mathrm{pop}()$;
        foreach $w \in N(t)$ do
            foreach $i = r_t$ to $\min\{\tilde{r}_t - 1, r_w - 2\}$ do
                $r_w^{(i)} = r_w^{(i)} + 1$;
                if $(r_w^{(i)} \ge p \cdot d_w) \wedge (r_w \ge d) \wedge (i + 1 < d)$ then
                    foreach $x \in N(w)$ do
                        $n_x^{(a)} \leftarrow \max\{n_x^{(a)} - 1, 0\}$;
                    $r_w = i + 1$;
                    if $w \notin Q$ then
                        $Q.\mathrm{push}((w, r_w))$;
Output $P$;
Except for $n_v^{(e)}$, we show that all other measures can be effectively kept track of in
only $O((m + n)d)$ time during the whole algorithm. When a vertex $u$ is selected, it causes
a chain reaction and activates a sequence of vertices or lowers the rounds in which
vertices are activated. Newly activated vertices together with their active rounds are
successively pushed into the queue $Q$ for further updating, much like what happens in
the Bellman-Ford shortest-paths algorithm. Every time we pop a vertex $v$ from $Q$, if $r_v$,
the current active round of $v$, is different from $\tilde{r}_v$, the active round of $v$ when $v$ was pushed
into $Q$, we update for each neighbor $w$ of $v$ the values of $r_w$ and $r_w^{(i)}$. If any neighbor $w$ of
$v$ changes its active round and $w$ is not in $Q$, we push $w$ into $Q$ for further update. The
update process stops when $Q$ is empty. Note that for each node $u \in V$, a change of $r_u$
can cause at most $d$ updates of $r_w^{(i)}$, where $w$ is a neighbor of $u$. For all neighbors of $u$, the
total number of updates is, hence, $O(d \cdot d(u))$. Thus, the total time for updating $r_w^{(i)}$, $\forall w \in V$,
in VirAds will be at most $O((m + n) \cdot d)$.

To maintain $n_v^{(e)}$, the easiest approach is to recompute all $n_v^{(e)}$. This approach,
called Exhaustive Update, is extremely time-consuming, as discussed for the naive
greedy algorithm. Instead, we only update $n_v^{(e)}$ when “necessary”. In detail, vertices are stored in
a max priority queue in which the priority is their effectiveness. In each step, the vertex
$u$ with the highest effectiveness is extracted and $n_u^{(e)}$ is recomputed. If, after updating, $u$
still has the highest effectiveness, $u$ is then selected. Otherwise, $u$ is pushed back into the
priority queue, and the new vertex with the highest effectiveness is considered, and so
on.
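The lazy update scheme can be sketched with a binary heap. This is a minimal illustration rather than the VirAds implementation: the function name `lazy_select`, the callback `recompute`, and the toy effectiveness values are assumptions introduced for the example, and `recompute` is assumed to be deterministic within one selection step.

```python
import heapq


def lazy_select(heap, recompute):
    """Pop the vertex with the highest up-to-date effectiveness.

    heap holds (-effectiveness, vertex) pairs whose values may be stale
    (heapq is a min-heap, hence the negation). recompute(vertex) returns
    the current effectiveness of a vertex.
    """
    while heap:
        _, u = heapq.heappop(heap)
        fresh = recompute(u)
        # u wins if its refreshed value still beats the best remaining entry.
        if not heap or fresh >= -heap[0][0]:
            return u, fresh
        heapq.heappush(heap, (-fresh, u))
    raise ValueError("empty priority queue")


# Toy data (assumed): stale priorities in the heap, true values on demand.
true_values = {"a": 3, "b": 7, "c": 5}
calls = []


def recompute(v):
    calls.append(v)
    return true_values[v]


heap = [(-10, "a"), (-8, "b"), (-5, "c")]
heapq.heapify(heap)
winner = lazy_select(heap, recompute)
print(winner, calls)
```

In the toy run only two of the three vertices are re-evaluated before a selection is made, which is the saving the scheme aims for.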

Approximation  Ratio  for  Power-law  Networks. 

The CFM problem can easily be shown to be NP-hard by a reduction from the set
cover problem. Thus, we are left with two choices: designing heuristics that have no
worst-case performance guarantees, or designing approximation algorithms that
guarantee the produced solutions are within a certain factor of the optimal. Formally,
a $\beta$-approximation algorithm for a minimization (maximization) problem always returns
solutions that are at most $\beta$ times larger (smaller) than an optimal solution.

Unfortunately, an approximation algorithm with a factor less than
$O(\log n)$ is unlikely to exist, as shown in the next section. However, if we assume the network is power-law, our
VirAds is an approximation algorithm for CFM with a constant factor.

Theorem 6.2. In power-law networks, VirAds is an $O(1)$-approximation algorithm for the
CFM problem for bounded values of $d$.

The theorem follows directly from the result in the previous section that the optimal
solution has size at least $\Omega(n)$ in power-law networks. Since any seeding contains at most $n$
vertices, the ratio between the size of VirAds's solution and that of the optimal solution is
bounded by a constant.

6.3  Hardness  of  the  CFM  Problem 

This section establishes the hardness of approximating the optimal solutions of the
CFM problem, i.e., the impossibility of finding near-optimal solutions in polynomial time.

In the previous section, we obtained an $O(1)$-approximation algorithm for CFM when the
network is power-law. However, without the power-law assumption, no polynomial-time algorithm
can approximate the problem within a factor less than $O(\log n)$ under the complexity
assumptions stated below. We first prove the
hardness for the case $d = 1$, which is an essential step in proving the hardness for
the general case $d > 1$. We begin with Feige's reduction for proving the $\ln n$ threshold for
the set cover problem. Our proof of the hardness of approximation for the CFM problem
requires understanding Feige's construction together with its parameter settings.
6.3.1  Feige’s  Reduction  for  Set  Cover 

Feige presented a reduction from a $k$-prover proof system for a MAX 3SAT-5
instance $\varphi$, a conjunctive normal form formula consisting of $n$ variables and
$5n/3$ clauses of exactly 3 literals. The verifier interacts with $k$ provers and asks the provers
different questions based on a random string $r$; each question involves $l/2$ clauses and
$l/2$ variables. If the formula $\varphi$ is satisfiable, then the provers have a strategy that causes
the verifier to accept for all random strings. If only a $(1 - \epsilon)$ fraction of the clauses in $\varphi$
are simultaneously satisfiable, then for all strategies of the provers, the verifier weakly
accepts with probability at most $k^2 \cdot 2^{-cl}$, where $c$ is a constant that depends only on $\epsilon$.

The core of the Set Cover gadget is a partition system $B(m, L, k, d)$, where $B$ is a
ground set of $m$ points. The partition system is a collection of $L = 2^{l}$ partitions $P_1, \ldots, P_L$
of $B$; each partition $P_i$ has exactly $k$ disjoint subsets $p_{i,1}, \ldots, p_{i,k}$. Any cover of the $m$ points
in $B$ requires at least $d = (1 - \frac{2}{k})\, k \ln m$ subsets. The condition to make constructing such
a system possible is that $k < \frac{\ln m}{3 \ln\ln m}$.

Let $R = (5n)^{l}$ denote the number of possible random strings for the verifier. We
make $R$ copies of the partition system $B$. Let $B_r$ denote the copy of the partition system associated
with the random string $r$, and $p^{r}_{i,j}$ the copy of set $p_{i,j}$ in $B_r$.

We are now ready to describe the instance of Set Cover in Feige's reduction.
The universal set $U = \bigcup_{r \in R} B_r$ contains $N = |U| = mR$ points, and the set system is
$\mathcal{S} = \{S_{q,a,i}\}_{q,a}$, where $i$ can be deduced from the syntax of $(q, a)$. Each set $S_{q,a,i}$ corresponds
to a question-answer pair $(q, a)$ of the $i$th prover, and $S_{q,a,i} = \bigcup_{(q,i) \in r} p^{r}_{a_r, i}$, where $(q, i) \in r$
means that on random string $r$, the $i$th prover receives question $q$, and $a_r$ is the assignment
of variables extracted from $a$.

As long as $k^2 2^{-cl} < \frac{8}{k^3 \ln^2 m}$, we obtain the hardness result $(1 - \frac{4}{k}) \ln m$, i.e., if formula
$\varphi$ is satisfiable, then the $mR$ points in $U$ can be covered by $kQ$ subsets, and if only a $(1 - \epsilon)$
fraction of the clauses are simultaneously satisfiable, the minimum set cover has size at
least $(1 - \frac{4}{k}) \ln m \cdot kQ$. Here, $Q$ is the set of all $n^{l} (5/3)^{l/2}$ possible questions (and, by abuse
of notation, its size). The condition
can be satisfied with $l > \frac{1}{c}(5 \log k + 2 \log \ln m)$.

The hardness ratio $(1 - f(k)) \ln m$ of the set cover is obtained from the following key
lemma.

Lemma 17. (Lemma 4.1 [38]) If $\varphi$ is satisfiable, then the above set of $N = mR$ points
can be covered by $kQ$ subsets. If only a $(1 - \epsilon)$ fraction of the clauses in $\varphi$ are
simultaneously satisfiable, the above set requires $(1 - 2f(k))\, kQ \ln m$ subsets in order to be
covered, where $f(k) \to 0$ as $k \to \infty$.
Note that $\ln m = (1 - \epsilon) \ln N$ by the setting of $n$, $l$, and $m$ in the proof. Thus, the final
hardness ratio is $(1 - \epsilon) \ln N$, where $N = |U|$. However, we can choose different settings
of $n$, $l$, and $m$ and obtain different hardness ratios.

We finish the presentation of Feige's reduction by giving upper bounds for quantities that
appear later in our proofs.

• The number of subsets $|\mathcal{S}| \le |Q|\, 2^{2l}$, since for each question $q \in Q$ there are at
most $2^{2l}$ answers of $2l$-bit length.

• The maximum size of a subset $\Delta_S = \max_{S \in \mathcal{S}} |S| \le m\, 3^{l/2}$, since for each $i$ and $q \in Q$
there are at most $3^{l/2}$ random strings $r$ such that the verifier makes query $q$ to the
$i$th prover, and $|p^{r}_{a_r, i}| \le m$.

• The maximum frequency of a point (element) in $U$: $f \le k\, 2^{l}$, because, for a pair
$(q, i)$, each subset $p^{r}_{a_r, i}$ is included at most $2^{l}$ times, and each point in $B_r$ appears
in exactly $k$ partitions.

6.3.2  One-hop  CFM 

We prove that the CFM problem cannot be approximated within a factor $\ln \Delta -
O(\ln\ln \Delta)$ in graphs of maximum degree $\Delta$, unless P = NP. The proof uses a gap-reduction
from an instance of the Bounded Set Cover problem ($SC_B$) to an instance of the CFM
problem whose degrees are bounded by $B' = B\, \mathrm{poly}\log B$. For background on
hardness of approximation and gap-reductions, we refer to [8].

Definition 8 (Bounded Set Cover). Given a set system $(U, \mathcal{S})$, where $U = \{e_1, e_2, \ldots, e_n\}$
is a universe and $\mathcal{S}$ is a collection of subsets of $U$. Each subset in $\mathcal{S}$ has at most $B$
elements and each element belongs to at most $B$ subsets, for a predefined constant $B > 0$.
A cover is a subfamily $\mathcal{C} \subseteq \mathcal{S}$ of sets whose union is $U$. Find a cover which uses the
minimum number of subsets.

We  state  the  tight  inapproximability  result  for  the  bounded  set  cover  by  Trevisan  [74] 
in  the  following  lemma. 

Lemma 18. There exist constants $B_0, c_0 > 0$ such that for every $B \ge B_0$ it is NP-hard to
approximate the $SC_B$ problem within a factor of $\ln B - c_0 \ln\ln B$.




Figure 6-2. Reduction from $SC_B$ to CFM when $d = 1$


The proof in [74] reduces an instance of GAP-SAT$_{1,\alpha}$ of size $n_s$ to an instance
$T = (U, \mathcal{S})$ of $SC_B$ by setting the parameters $l, m$ in Feige's construction [38] to be
$\theta(\ln\ln B)$ and $\frac{B}{\mathrm{poly}\log(B)}$, respectively. Denoting by $\Delta_S$ the maximum cardinality of sets,
and by $f$ the maximum frequency of elements in $U$, we have

• $|U| = m\, n_s^{l}\, \mathrm{poly}\log B$, $|\mathcal{S}| = n_s^{l}\, \mathrm{poly}\log B$

• $\Delta_S \le B$, $f \le \mathrm{poly}\log B$ for sufficiently large $B$.

$SC_B$-CFM reduction. For each instance $T = (U, \mathcal{S})$ of $SC_B$, we construct a graph
$H = (V, E)$ as follows (Fig. 6-2):

• Construct a bipartite graph with the vertex set $U \cup \mathcal{S}$ and edges between $S$ and all
elements $e_i \in S$, for each $S \in \mathcal{S}$.

• Add a set $D$ consisting of $t$ vertices and a set $D'$ with the same number of vertices, say
$D = \{x_1, x_2, \ldots, x_t\}$ and $D' = \{x'_1, x'_2, \ldots, x'_t\}$, where $t = \frac{|U|}{B \ln^2 B}$.

• Connect $x_i$ to $x'_i$, $\forall i = 1 \ldots t$. This enforces the selection of $x_i$ in the optimal CFM.

• Connect each vertex $e_j \in U$ to $\lceil \frac{p}{1-p} f(e_j) \rceil - 1$ and each vertex $S_k \in \mathcal{S}$ to $\lceil \frac{p}{1-p} |S_k| \rceil$
vertices in $D$, where $f(e_j)$ is the frequency of element $e_j$. During the connection,
we balance the degrees of vertices in $D$.

We can assume w.l.o.g. that optimal solutions of CFM contain all vertices in $D$
but none of those in $D'$. Then, all vertices in $\mathcal{S}$ will be activated after the first round, and
a vertex in $U$ is activated if and only if one of its neighbors in $\mathcal{S}$ is selected into the
solution. Thus, the following lemma holds.
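The role of the neighbor counts in $D$ can be checked with a short calculation. This is a sketch under the assumption that a vertex is activated once at least a fraction $p$ of its neighbors are active, and that the construction attaches $\lceil \frac{p}{1-p}|S_k| \rceil$ neighbors in $D$ to each $S_k$ and $\lceil \frac{p}{1-p} f(e_j) \rceil - 1$ to each $e_j$; the symbols $x$ and $y$ are introduced only for this check.

```latex
% S_k has |S_k| neighbors in U and x neighbors in D. With all of D selected,
% S_k is activated in round 1 iff x >= p (|S_k| + x).
\[
x \ge p\,(|S_k| + x) \iff x \ge \frac{p}{1-p}\,|S_k| ,
\qquad \text{so } x = \Big\lceil \frac{p}{1-p}\,|S_k| \Big\rceil \text{ suffices.}
\]
% e_j has f(e_j) neighbors in S and y neighbors in D. The neighbors in D alone
% must not activate e_j, but one additional selected set must.
\[
y < p\,\big(f(e_j) + y\big) \le y + 1
\qquad \text{holds for } y = \Big\lceil \frac{p}{1-p}\, f(e_j) \Big\rceil - 1 .
\]
```

The first inequality in the second line is $y < \frac{p}{1-p} f(e_j)$, and the second follows from $y + 1 \ge \frac{p}{1-p} f(e_j)$, which gives $p\, f(e_j) \le (1-p)(y+1) \le (1-p)\,y + 1$.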

Lemma 19. The size difference between the optimal CFM of $H$ and the optimal $SC_B$ of
$T$ is exactly the cardinality of $D$, i.e., $\mathrm{OPT}_{CFM}(H) = \mathrm{OPT}_{SC}(T) + t$.

The key to preserving the hardness ratio is to keep the degree of vertices in $H$
bounded and the gap between the optimal solutions' sizes small.

Lemma 20. If $t = \frac{|U|}{B \ln^2 B}$, then the maximum degree of vertices in $H$ will be $B' = \Delta(H) =
O(B\, \mathrm{poly}\log B)$.

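Proof. We can verify that vertices in $\mathcal{S}$ and $U$ have degree $O(B)$. Vertices in $D$ have
degrees at most $\frac{\mathrm{vol}(D)}{t} + 1$, where $\mathrm{vol}(D)$ is the total degree of vertices in $D$. Define
$\phi(X, Y)$ as the set of edges crossing between two vertex subsets $X$ and $Y$. We have

\[
\begin{aligned}
\mathrm{vol}(D) &= |\phi(D, D')| + |\phi(D, U)| + |\phi(D, \mathcal{S})| \\
&= t + \sum_{S_k \in \mathcal{S}} \Big\lceil \frac{p}{1-p}\,|S_k| \Big\rceil + \sum_{e_j \in U} \Big( \Big\lceil \frac{p}{1-p}\, f(e_j) \Big\rceil - 1 \Big) \\
&\le \frac{2p}{1-p}\,|\mathcal{S}|\,B + |\mathcal{S}| + t = \Big( \frac{2p}{1-p}\,B + 1 \Big) |\mathcal{S}| + t
\end{aligned}
\]

We have used the facts that $\sum_{S_k \in \mathcal{S}} |S_k| = \sum_{e_j \in U} f(e_j)$ and $|S_k| \le B$, $\forall S_k \in \mathcal{S}$.

Thus,

\[
\begin{aligned}
B' &\le \frac{\mathrm{vol}(D)}{t} + 1 \le \frac{1}{t} \Big( \Big( \frac{2p}{1-p}\,B + 1 \Big) |\mathcal{S}| + t \Big) + 1 && \text{(6-4)} \\
&= \Big( \frac{2p}{1-p}\,B + 1 \Big) \frac{B \ln^2 B \; n_s^{l}\, \mathrm{poly}\log B}{m\, n_s^{l}\, \mathrm{poly}\log B} + 2 \\
&\le O(B\, \mathrm{poly}\log B) && \text{(6-5)}
\end{aligned}
\]

This completes the proof. □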


Theorem 6.3. When $d = 1$, it is NP-hard to approximate the CFM problem in graphs with
degrees bounded by $B'$ within a factor of $\ln B' - c_1 \ln\ln B'$, for some constant $c_1 > 0$.
Proof. We prove by contradiction. Assume there exists an algorithm $A$ that finds, in graphs with
degrees bounded by $B'$ and $d = 1$, a CFM of size at most $(\ln B' - c_1 \ln\ln B')\,\mathrm{OPT}_{CFM}$,
where $\mathrm{OPT}_{CFM}$ is the size of an optimal CFM. Let $T = (U, \mathcal{S})$ be an instance of $SC_B$
with the optimal solution of size $\mathrm{OPT}_{SC}$. Construct an instance $H$ of the CFM problem using
the reduction $SC_B$-CFM as shown above. From (6-5), there exists a constant $\beta > 0$
so that $B' \le B \ln^{\beta} B$. Using algorithm $A$ on $H$, we obtain a solution of size at most
$(\ln B' - c_1 \ln\ln B')\,\mathrm{OPT}_{CFM}$. We can then convert that to a solution of $SC_B$ by excluding
vertices in $D$ (see Lemma 19) and obtain a set cover of size at most

\[
(\ln B' - c_1 \ln\ln B')(\mathrm{OPT}_{SC} + t) - t \tag{6-6}
\]

Since each set in $\mathcal{S}$ can cover at most $B$ elements, we have $\mathrm{OPT}_{SC} \ge \frac{|U|}{B} = t \ln^2 B$, thus
$t \le \frac{\mathrm{OPT}_{SC}}{\ln^2 B}$. If we select $c_1 = c_0 + \beta + 1$, the solution of $SC_B$ is then, after some algebra,
at most $(\ln B - c_0 \ln\ln B)\,\mathrm{OPT}_{SC}$, which contradicts Lemma 18. □
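The algebra behind the last step can be spelled out as follows. This is a sketch assuming $B$ is large enough that $x \mapsto \ln x - c_1 \ln\ln x$ is increasing on the range considered and that $\ln\ln B \ge 1/\ln B$; it uses only $B' \le B \ln^{\beta} B$, $c_1 = c_0 + \beta + 1$ and $t \le \mathrm{OPT}_{SC} / \ln^2 B$.

```latex
\begin{align*}
\ln B' - c_1 \ln\ln B'
  &\le \ln B + \beta \ln\ln B - c_1 \ln\ln B
     && (B' \le B \ln^{\beta} B) \\
  &= \ln B - (c_0 + 1) \ln\ln B
     && (c_1 = c_0 + \beta + 1) \\[4pt]
(\ln B' - c_1 \ln\ln B')(\mathrm{OPT}_{SC} + t) - t
  &\le \big(\ln B - (c_0 + 1)\ln\ln B\big)\Big(1 + \frac{1}{\ln^2 B}\Big)\,\mathrm{OPT}_{SC}
     && (t \le \mathrm{OPT}_{SC} / \ln^2 B) \\
  &\le \Big(\ln B - (c_0 + 1)\ln\ln B + \frac{1}{\ln B}\Big)\,\mathrm{OPT}_{SC} \\
  &\le (\ln B - c_0 \ln\ln B)\,\mathrm{OPT}_{SC}
     && (1/\ln B \le \ln\ln B)
\end{align*}
```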

Similarly, with appropriate settings in Feige's construction [38], we obtain the
following hardness result regarding the network size $n$ (the proof details can be found
in the technical report on our website).

Theorem 6.4. For any $\epsilon > 0$, the CFM problem, when $d = 1$, cannot be approximated
within a factor $(\frac{1}{2} - \epsilon) \ln n$, unless $NP \subset DTIME(n^{O(\log\log n)})$.

Proof. We use the same gadget as in Fig. 6-2 to prove the hardness for CFM. Since we
no longer need to keep the degrees of vertices in the gadget bounded, we form a clique with
the vertices in $D$.

We  can  connect  each  v  e  (SUU)  to  pv  vertices  in  D.  That  is 

\D\  =  0(  max  9{— —As))  =  0(AS)  =  0(m  3l/2) 

vE(SUU)nv  1  —  p 
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Or  equivalently 


\D\2  =  0{  £  d(v)+x0(\S\  +  \U\)  =  O(2Yd{v)  +  \S\  +  \U\)  =  O(mRk2l) 

v£(SUU)  v&A 

To  summarize,  the  sufficient  condition  is 

\D\  =  0{m2e (z)  +  (mRk2l)1/2).  (6-7) 


By  Lemma  17  and  the  construction,  the  hardness  ratios  of  our  problems  are  given 
by 

(1  —  | )kQ  In  m  +  \D\ 
kQ+  \D\  ' 

Unfortunately,  with  the  same  setting  in  the  Feige’s  reduction,  \D\  =  0(AS)  = 
0((5n)?2e(z)),  the  above  hardness  ratio  gets  arbitrary  close  to  1.  Hence  we  use  a 
different  setting  in  which  m  =  (5?i)d  with  a  small  constant  c  >  0  to  reduce  the  maximum 
degree.  The  consequence  is  that  the  inapproximability  ratio  is  reduced  accordingly. 

The  optimal  setting  to  get  the  best  inapproximability  ratio  is  to  set  m  =  (5n)*(1-e)  for 
some  e  >  0.  Then,  N  =  mR  =  (5 n)l{~2~e\  or  m  =  N^.  From  (6-7),  it  is  sufficient  that 

2*(0 

\D\  =  =  0(Q) 

n  2 


Hence,  the  hardness  ratio  will  be 

(1  -  f)kQ\nm  +  o(Q) 


>  (1  —  — )  In  m 
k 


kQ  +  o(Q) 

The  number  of  vertices  in  the  graph,  denoted  by  nn,  is 

/  l/2 

nn  =  2\D\  +  |«5|  +  \U\  <  9(jn3l/2)  +  nl221  +  (5  n)2l~e  <  2\U\  =  2N 

Finally,  the  hardness  ratio  is  at  least 

(1  “  l] ln  (t)  /2  “  >  (1  "  0  “  2^l)  lnn"  -  9(1)  >  ^(1  -  J)  'nn« ■ 


Here,  we  assume  k  is  sufficiently  large  and  e  is  sufficiently  small. 


□ 


□ 
Note that Theorems 6.3 and 6.4 are incomparable in general. Let $\Delta$ be the
maximum degree. Theorem 6.3 implies hardness of approximation with factor
$(1 - \epsilon) \ln \Delta$, which is larger than $(\frac{1}{2} - \epsilon) \ln n$ if $\Delta \approx n$, but smaller when $\Delta < \sqrt{n}$, for
example in power-law graphs with exponent $\gamma > 2$. In addition, Theorem 6.4
uses a stronger assumption than that in Theorem 6.3.

6.3.3  Multiple-hop  CFM 

We now present a gap reduction from the one-hop CFM problem to the CFM
problem with $d \ge 2$. The hardness result follows immediately from Theorem 6.3 in the
previous section.

Given a graph $G = (V, E)$ as an instance of the one-hop CFM problem, we construct an
instance $G' = (V', E')$ of the $d$-hop CFM problem as follows (as illustrated in Fig. 6-4). We
add $c(p)$ vertices $w_1, w_2, \ldots, w_{c(p)}$, called flashpoints, where $c(p) = \min\{t \in \mathbb{N} \mid \frac{t}{t+2} < p \le
\frac{t+1}{t+2}\}$. These vertices will be selected at the beginning to kick off the activation of other
nodes. Furthermore, each flashpoint $w_p$ is connected to a dummy vertex $z_p$.

Figure 6-3. The transmitter gadget.

Replace each edge $(u, v) \in E$ by a gadget called a transmitter. The transmitter
connecting vertices $u$ and $v$ is a path of $d - 1$ vertices, named $uv_1$ to $uv_{d-1}$. The vertex $u$
is connected to $uv_1$, $uv_1$ is connected to $uv_2$, and so on; vertex $uv_{d-1}$ is connected to $v$.
Each vertex $uv_i$, $i = 1..d - 1$, is connected to all flashpoints. An example of a transmitter
is shown in Fig. 6-3. The transmitter is designed so that if all flashpoints and vertex $u$
are selected at the beginning, then vertex $uv_{d-1}$ will be activated after $d - 1$ rounds.
Hence, the number of activated neighbors of $v$ after $d - 1$ rounds will equal the number
of selected neighbors of $v$ in the original graph.

Figure 6-4. Gap-reduction from one-hop CFM to $d$-hop CFM.

Finally, we replace each edge $(w_p, z_p)$ by a transmitter. In order to activate all
dummy vertices $z_p$ after $d$ rounds, we can assume, w.l.o.g., that all flashpoints are
selected in an optimal solution. The following lemma follows directly from the
construction.
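The construction of $G'$ can be sketched in code. This is an illustrative sketch, not the authors' implementation: the function names and the vertex labels are hypothetical, and the number of flashpoints is assumed to be the smallest $t \ge 0$ with $p(t+2) \le t+1$, which makes an inner transmitter vertex (of degree $t + 2$) active once all flashpoints and one neighbor on the path are active, but not with the flashpoints alone. Vertices of $G$ that appear in no edge are ignored.

```python
def flashpoint_count(p):
    """Smallest t >= 0 with p * (t + 2) <= t + 1 (assumed reading of c(p))."""
    t = 0
    while p * (t + 2) > t + 1:
        t += 1
    return t


def build_d_hop_instance(edges, p, d):
    """Return the adjacency sets of G' built from the edge list of G."""
    adj = {}

    def connect(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    flashpoints = [("w", i) for i in range(flashpoint_count(p))]

    def add_transmitter(u, v):
        # Path u - uv_1 - ... - uv_{d-1} - v; inner vertices touch every flashpoint.
        chain = [u] + [("t", u, v, i) for i in range(1, d)] + [v]
        for a, b in zip(chain, chain[1:]):
            connect(a, b)
        for inner in chain[1:-1]:
            for w in flashpoints:
                connect(inner, w)

    for u, v in edges:
        add_transmitter(u, v)
    for w in flashpoints:
        add_transmitter(w, ("z", w[1]))  # flashpoint-to-dummy transmitter
    return adj


# Toy example (assumed): a single edge, p = 0.6, d = 3.
gadget = build_d_hop_instance([("u", "v")], p=0.6, d=3)
print(len(gadget), sum(len(nbrs) for nbrs in gadget.values()) // 2)
```

Adjacency sets are used so that an inner vertex of a flashpoint-to-dummy transmitter that is already adjacent to its own flashpoint is not connected to it twice.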

Lemma 21. Every solution of size $k$ for the one-hop ($d = 1$) CFM problem in $G$ induces
a solution of size $k + c(p)$ for the $d$-hop CFM problem in $G'$.

In the other direction, we also have the following lemma.

Lemma 22. An optimal solution of size $k'$ for the $d$-hop CFM problem in $G'$ induces a
solution of size $k' - c(p)$ for the one-hop CFM problem in $G$.

Proof. For a transmitter connecting $u$ to $v$, if the solution of the $d$-hop CFM problem
contains any of the intermediate vertices $uv_1, \ldots, uv_{d-1}$, we can replace that vertex in
the solution with either $u$ or $v$ to obtain a new solution of the same size (or less). Hence, we
can assume, w.l.o.g., that none of the intermediate vertices are selected. Therefore, all
flashpoints must be selected in order to activate the dummy vertices. It is easy to see
that the solution of $d$-hop CFM excluding the flashpoints will be a solution of one-hop
CFM in $G$ with size $k' - c(p)$. □
Note that the number of vertices in $G'$ is upper-bounded by $dn^2$, i.e., $\ln |V'| \le
2 \ln |V| + \ln d$. Thus, using the same arguments used in the proof of Theorem 6.4, we
can show that a $(\frac{1}{4} - \epsilon) \ln n$ approximation algorithm for the $d$-hop CFM problem leads to a $(\frac{1}{2} - \epsilon) \ln n$
approximation algorithm for the one-hop CFM problem (contradicting Theorem 6.4).

Theorem 6.5. The CFM problem cannot be approximated within $(\frac{1}{4} - \epsilon) \ln n$ for $d > 1$,
unless $NP \subset DTIME(n^{O(\log\log n)})$.

6.4  Empirical  Study 

In this section we perform experiments on OSNs to show the efficiency of our
algorithm in comparison with a simple degree centrality heuristic, and to study the trade-off
between the number of times the information is allowed to propagate in the network and
the seeding size.

6.4.1  Comparing  to  Optimal  Seeding 

One advantage of our discrete diffusion model over probabilistic ones [49, 50] is
that the exact solution can be found using mathematical programming. This enables us
to study the exact behavior of the seeding size when the number of propagation hops
varies.


Figure 6-5. Seeding size (in percent) on Erdős's Collaboration network for (A) p = 0.4,
(B) p = 0.6, and (C) p = 0.8. VirAds produces close to the optimal seeding in only
fractions of a second (in comparison to the 2-day running time of the IP (optimal)).
We formulate the CFM problem as a 0-1 Integer Linear Programming (ILP)
problem below.

\[
\begin{aligned}
\text{minimize} \quad & \sum_{v \in V} x_v^{0} && && \text{(6-8)} \\
\text{subject to} \quad & \sum_{v \in V} x_v^{d} \ge |V| && && \text{(6-9)} \\
& \sum_{w \in N(v)} x_w^{i-1} + \lceil p \cdot d(v) \rceil\, x_v^{i-1} \ge \lceil p \cdot d(v) \rceil\, x_v^{i} && \forall v \in V,\ i = 1..d && \text{(6-10)} \\
& x_v^{i} \ge x_v^{i-1} && \forall v \in V,\ i = 1..d && \text{(6-11)} \\
& x_v^{i} \in \{0, 1\} && \forall v \in V,\ i = 0..d && \text{(6-12)}
\end{aligned}
\]

where $x_v^{i} = 0$ if $v$ is inactive at round $i$, and $x_v^{i} = 1$ otherwise.

The objective of the ILP is to select a minimum number of seeds at the beginning.
Constraint (6-9) guarantees that all nodes are activated at the end, while (6-10) encodes the
propagation condition; constraint (6-11) simply keeps vertices active once they are
activated.
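The constraints can be made concrete by checking a candidate 0-1 assignment against them. The sketch below is illustrative only: the function name `check_ilp_solution`, the data layout, and the toy path graph are assumptions, and no ILP solver is involved.

```python
import math


def check_ilp_solution(adj, p, d, x):
    """Check constraints (6-9) to (6-11) for a 0/1 assignment x[i][v].

    x is a list of d + 1 dicts; x[i][v] == 1 means v is active at round i.
    Returns the objective value (number of seeds) if x is feasible, else None.
    """
    if any(x[d][v] != 1 for v in adj):  # (6-9): everything active at round d
        return None
    for i in range(1, d + 1):
        for v in adj:
            need = math.ceil(p * len(adj[v]) - 1e-12)
            active_nbrs = sum(x[i - 1][w] for w in adj[v])
            if active_nbrs + need * x[i - 1][v] < need * x[i][v]:  # (6-10)
                return None
            if x[i][v] < x[i - 1][v]:  # (6-11): active vertices stay active
                return None
    return sum(x[0].values())  # objective (6-8)


# Toy example (assumed): a path 0 - 1 - 2 - 3 with p = 0.5, seeded at vertex 1.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
x0 = {0: 0, 1: 1, 2: 0, 3: 0}
x1 = {0: 1, 1: 1, 2: 1, 3: 0}
all_active = {0: 1, 1: 1, 2: 1, 3: 1}
two_rounds = check_ilp_solution(path, 0.5, 2, [x0, x1, all_active])
one_round = check_ilp_solution(path, 0.5, 1, [x0, all_active])
print(two_rounds, one_round)
```

With two rounds the single seed is feasible; with one round the assignment violates (6-10) at vertex 3, whose only neighbor is inactive at round 0.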

We solve the ILP problem on the Erdős collaboration network, the social network of the
famous mathematician [13]. The network consists of 6100 vertices and 15030 edges.
The ILP is solved with the optimization package GUROBI 4.5 on an Intel Xeon 2.93 GHz
PC, with the time limit for the solver set to 2 days. The running time of the IP
solver increases significantly when $d$ increases. For $d = 1, 2,$ and $3$, the solver returns the
optimal solutions. However, for $d = 4$, the solver cannot find the optimal solutions within
the time limit and returns sub-optimal solutions with relative errors of at most 15%.

The optimal (or sub-optimal) seeding sizes are shown in Figs. 6-5A, 6-5B, and 6-5C
for p = 0.4, 0.6, and 0.8, respectively. VirAds provides close-to-optimal solutions and
performs much better than Max Degree. In particular, when p = 0.8, VirAds's seeding
differs from the optimal solutions by only one or two nodes. In addition, VirAds takes only
fractions of a second to generate the solutions.

As proven in Section 6.1, the seeding takes a constant fraction of the nodes in the
network. For the Erdos Collaboration Network, the seeding consists of 3.8% to 7% of the
nodes in the network. Further, the seeding can be as large as 20% to
40% of the nodes for the larger social networks in the next section.

Although the mathematical approach can accurately measure
the optimal seeding size, it cannot be applied to larger networks. The rest of our
experiments measure the quality and scalability of our proposed algorithm VirAds on a
collection of large networks.
Figure 6-6. Seeding size when the number of propagation hops d varies (p = 0.3). VirAds
consistently has the best performance.

[Plots omitted. Panels: A Physics, B Facebook, C Orkut; x-axis: number of rounds d.]

Figure 6-7. Running time when the number of propagation hops d varies (p = 0.3). Even
for the largest network of 110 million edges, VirAds takes less than 12
minutes.

Figure 6-8. Degree distribution of studied networks
6.4.2 Large Social Networks

We select networks of various sizes, including the coauthor network in the Physics
sections of the e-print arXiv [49], Facebook [76], and Orkut [59], a social networking site run
by Google. Links in all three networks are undirected and unweighted. The sizes of the
networks are presented in Table 6-1. The degree distributions of those networks are
shown in Fig. 6-8.

Table 6-1. Sizes of the investigated networks

              Physics    Facebook     Orkut
Vertices      37,154     90,269       3,072,441
Edges         231,584    3,646,662    223,534,301
Avg. Degree   12.5       80.8         145.5

Physics: We shall refer to the physics coauthor network as the Physics network or simply
Physics. Each node in the network represents an author, and there is an edge between
two authors if they have coauthored one or more papers. The Facebook dataset covers 52% of the
users in the New Orleans network [76]. The Orkut dataset was collected by crawling in late
2006 [59]. It contains about 11.3% of Orkut’s users.

6.4.3  Solution  Quality  in  Large  Social  Networks 

We compare our VirAds algorithm with the following heuristics: the Random method,
in which vertices are picked randomly until they form a d-hop seeding, and the Max Degree
method, in which vertices with the highest degree are selected until they form a d-hop seeding.
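The Max Degree baseline can be sketched in a few lines. This is an illustrative sketch only: the names `spreads_to_all` and `max_degree_seeding` are hypothetical, the graph is assumed to be an adjacency dictionary, and feasibility is re-checked from scratch after each pick, which is far too slow for the networks in Table 6-1 but keeps the logic easy to follow. It is the baseline heuristic, not VirAds.

```python
import math

def spreads_to_all(adj, seeds, p, d):
    """True if the seeds activate every vertex within d synchronous rounds."""
    active = set(seeds)
    for _ in range(d):
        newly = {
            v for v, nbrs in adj.items()
            if v not in active
            and sum(w in active for w in nbrs) >= math.ceil(p * len(nbrs))
        }
        if not newly:
            break
        active |= newly
    return len(active) == len(adj)

def max_degree_seeding(adj, p, d):
    """Add vertices in decreasing order of degree until a d-hop seeding is formed."""
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    seeds = []
    for v in order:
        if spreads_to_all(adj, seeds, p, d):
            break
        seeds.append(v)
    return seeds
```

The Random method would differ only in the `order` line, for example by shuffling the vertex list instead of sorting it by degree.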
[Plots omitted. Panels: A Physics, B Facebook, C Orkut; x-axis: influence factor p;
series: Random, Max Degree, Exhaustive Update, VirAds.]

Figure 6-9. Seeding size at different influence factors p (the maximum number of
propagation hops is d = 4).


Finally, we compare VirAds with its naive implementation, called Exhaustive Update,
in which, after a vertex is selected into the seeding, the effectiveness of all the remaining
vertices is recalculated. With a more accurate estimate of vertex effectiveness,
Exhaustive Update is expected to produce higher-quality solutions than VirAds.

The seeding sizes for different numbers of propagation hops d when p = 0.3 are
shown in Fig. 6-6. To our surprise, VirAds performs as well as or better than Exhaustive
Update even though it spends significantly less effort updating vertex effectiveness. VirAds
has a smaller seeding than Exhaustive Update in Physics; both give similar results
for Facebook; and Exhaustive Update could not finish on Orkut after 48 hours and
was forced to terminate. Updating the vertices’ effectiveness sparingly turns out to be
sufficient, since the influence propagation is locally bounded. In addition, the
seedings produced by VirAds are almost two times smaller than those of Random.

The gap between VirAds and Max Degree narrows as the maximum number of
hops increases. Hence, selecting nodes with high degrees as the seeding is
a good long-term strategy, but might not be efficient for fast propagation when the
number of hops is limited. In Facebook and Orkut, when d = 1, Max Degree has 60% to
70% more vertices in the seeding than VirAds. In Physics, the gap between VirAds and
Max Degree is less pronounced. Nevertheless, VirAds consistently produces the best
solutions in all networks.
6.4.4 Scalability

The running times of all methods for different numbers of propagation hops d are presented in
Fig. 6-7. The time is measured in seconds and presented on a log scale. The running times
increase slightly with the number of propagation rounds d and are proportional
to the size of the network. Exhaustive Update has the worst running time, taking
up to 15 minutes for Physics and 20 minutes for Facebook. For Orkut, the algorithm cannot
finish within 2 days, as mentioned. The three remaining algorithms, VirAds, Max Degree,
and Random, take less than one second for Physics and less than 10 seconds for
Facebook. Even on the largest network, Orkut, with more than 220 million edges, VirAds
requires less than 12 minutes to complete.

6.4.5  Influence  factor 

We study the performance of VirAds and the other methods at different influence
factors p. The number of propagation rounds d is fixed to 4. The sizes of the d-hop seeding sets
are shown in Fig. 6-9. VirAds is clearly still the best performer. The seeding sizes of
VirAds are up to 5 times smaller than those of Max Degree for small p (although this is hard
to see on the charts due to the small seeding sizes).

Since all tested networks are social networks with small diameter, the seeding sizes
go to zero when p is close to zero. The exception is Physics, in which the seeding
sizes do not go below 10% of the number of vertices in the network even when p = 0.05.

A closer look at the Physics network reveals that it contains many isolated
cliques of small sizes (2, 3, 4, and so on), which correspond to authors who appear in
only one paper. In each clique, regardless of the threshold p, at least one vertex must be
selected; thus the seeding size cannot fall below the number of isolated cliques in the
network. To eliminate the effect of isolated cliques, a possible approach is to restrict the
problem to the largest component in the network.
CHAPTER 7
CONCLUSION

Society  relies  heavily  on  its  networked  physical  infrastructure  and  information 
systems.  To  detect  vulnerability  issues  in  a  network,  it  is  of  particular  importance  to 
analyze  how  well-connected  the  network  will  remain  after  a  disruptive  event  takes 
place.  We  propose  the  use  of  pairwise  connectivity,  the  number  of  connected  pairs 
in  the  network,  as  a  disruptive  effect  measurement,  and  use  it  to  formulate  network 
vulnerability  assessment  as  optimization  problems.  The  objective  is  to  identify  the 
minimum  set  of  critical  network  elements  (nodes  or  edges)  whose  removal  results  in  a 
major  degradation  of  the  network  pairwise  connectivity. 
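To make the pairwise connectivity measure concrete, the following minimal sketch counts connected pairs after a set of nodes is removed. The function name `pairwise_connectivity` and the adjacency-dictionary input are assumptions for illustration; the sketch only evaluates the measure and does not solve the optimization problems themselves.

```python
def pairwise_connectivity(adj, removed=()):
    """Number of vertex pairs that remain connected after deleting `removed`.

    adj     : dict mapping each vertex to a list of its neighbours (assumed format)
    removed : iterable of vertices deleted from the network
    """
    seen = set(removed)  # removed vertices are never visited
    total = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search over one connected component
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        total += size * (size - 1) // 2  # a component of size s has s(s-1)/2 pairs
    return total
```

For example, a path on five vertices has 10 connected pairs; removing the middle vertex leaves two components of size two and only 2 connected pairs.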

We prove that both critical edge detection (CED) and critical node detection
(CND) are NP-complete [33], and develop two novel solutions with provable guarantees:
1) an $O(\log^{1.5} n)$ bicriteria approximation algorithm for CED based on constructing a
decomposition tree with recursive c-balanced cuts and 2) an $O(\log n \log\log n)$ bicriteria
approximation algorithm for CND [31]. Later we design a bicriteria approximation
algorithm with performance guarantee $O(\sqrt{\log n})$ when the set of critical elements may
include both edges and nodes. This immediately implies improved results for both
CED and CND. Extensive experiments have revealed many insights into the relative
criticality of edges and nodes on different network topologies.

In dynamic networks, e.g., cellular networks or mobile sensor networks, detecting
critical nodes is extremely challenging due to the continual changes in network topology.
We abstract dynamic networks as probabilistic graphs and measure the disruptive
effect in terms of expected pairwise connectivity (EPC). Computing EPC is closely
related to network reliability problems, some of the most classical open #P-complete
problems. Beyond showing the #P-completeness of EPC, we have approximated EPC
with an FPRAS, which gives a potential direction for tackling open questions in network
reliability. Further, we formulate the problem of detecting critical nodes as a two-level
stochastic program and present a sample average approximation algorithm that solves
the formulation with guaranteed accuracy.

In Chapter 6 we investigate cascading failures in complex systems. Such failures
often propagate and lead to much more devastating consequences. Thus, it is crucial
to detect critical nodes whose failures will trigger a cascading failure across an entire
network, leaving major nodes in the failure state within a given number of steps. Our
theoretical analysis shows that the cascading of failures may be quite different in
power-law networks than in others. First, we prove that a large number of initial failures are
required to trigger a network-wide failure. Second, the problem of detecting critical
nodes cannot be approximated within a factor $O(\log n)$ in general graphs; however, there
is a constant-factor approximation algorithm for the problem in power-law networks.
Extensive experiments on large-scale OSNs with up to hundreds of millions of edges
demonstrate the effectiveness of our proposed algorithm. Our study also applies
naturally to the problems of information propagation, viral marketing, and disease
spreading.
Many practical complex networks, such as the Internet, the WWW, and social networks,
have been found to follow a power-law distribution in their degree sequences, i.e., the
number of nodes with degree $i$ in these networks is proportional to $i^{-\beta}$ for some
exponential factor $\beta > 1$. Exploiting such networks has become an urgent
need, yet remains open, especially from a theoretical viewpoint.

In this dissertation, we first investigate whether many optimization
problems are easier to solve in power-law networks. Our work focuses on the hardness and inapproximability
of optimization problems on power-law graphs (PLG). In particular, we show that
Minimum Dominating Set, Minimum Vertex Cover, and Maximum Independent
Set are still APX-hard on power-law graphs. We further show the inapproximability
factors of these optimization problems and of a more general problem (p-Minimum
Dominating Set), which prove that the belief in a $(1 + o(1))$-approximation algorithm
for these problems on power-law graphs is not always true. To show the
above theoretical results, we propose a general cycle-based embedding technique
to embed any d-bounded graph into a power-law graph. In addition, we present a
brief description of the relationship between the exponential factor $\beta$ and constant
greedy approximation algorithms. Moreover, we propose an algorithm framework, called the
Low-Degree Percolation (LDP) Algorithm Framework, for solving the Minimum Dominating
Set, Minimum Vertex Cover, and Maximum Independent Set problems in power-law
graphs. Using this framework, we further give a theoretical framework to derive
the approximation ratios for these optimization problems in two well-known random
power-law graphs. Numerical analysis shows that our proposed framework not only
leads to a good theoretical approximation ratio but also performs even better
than the theoretical bounds.

In addition, the robustness of power-law networks attracts increasing research attention,
since they are exposed to a great number of threats such as adversarial attacks on the
Internet, cybercrimes on the WWW, or malware propagation on social networks. In this
dissertation, we first show that it is NP-hard to detect critical links and nodes even in power-law
networks. Since this rules out promptly assessing the vulnerability of power-law networks
in this manner, we turn to the vulnerability of power-law networks under
random attacks and adversarial attacks, using in-depth probabilistic analysis based on the
theory of random power-law graph models. Our results indicate that power-law networks
are able to tolerate random failures if their exponential factor $\beta$ is less than 2.9, and that they
are more robust against intentional attacks if $\beta$ is smaller. In the presence of cascading
failures, we show that power-law networks are very vulnerable,
since any random failures of high-degree nodes can easily overload the
low-degree nodes.

Finally, we study the optimization of power-law networks from design and protection
perspectives. On the one hand, we reveal the best range [1.8, 2.5] for the exponential
factor $\beta$ by optimizing complex networks in terms of both their vulnerabilities and
costs. When $\beta < 1.8$, the network maintenance cost is very high, and when
$\beta > 2.5$ the network robustness is unpredictable, since it depends on the specific
attack strategy. On the other hand, we study the Critical Link Disruptor (CLD) and
Critical Node Disruptor (CND) optimization problems, which identify critical links and
nodes in a network whose removal maximally destroys the network’s functions. After
showing the NP-hardness of these two problems, we propose HILPR, a novel LP-based
rounding algorithm, for solving the CLD and CND problems in a timely manner.

In the case of cascading failures, we further develop the TRPA algorithm, an iterative
2-phase algorithm, for solving the Cascading Critical Node Disruptor (CCND) problem. The
effectiveness of our solutions is validated on various synthetic and real-world networks.
CHAPTER 1
INTRODUCTION

One of the most remarkable discoveries in many real-world networks is the
power-law distribution in their degree sequences, ranging from the Internet [34], the WWW
[4], and biological networks [12] to social networks [74]. In particular, the number of nodes
with degree $i$ in these complex networks is observed to be proportional to $i^{-\beta}$ for some
exponential factor $\beta > 1$.

1.1  Power-Law  Graphs 

1.1.1  Formal  Definition 

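We consider the following graph, the $(\alpha, \beta)$ graph $G_{(\alpha,\beta)}$, whose power-law degree
distribution depends on two given values $\alpha$ and $\beta$.

Definition 1 ($(\alpha, \beta)$ Graph $G_{(\alpha,\beta)}$). Given an undirected graph $G = (V, E)$ having $|V| = n$
nodes and $|E| = m$ edges, it is called an $(\alpha, \beta)$ power-law graph if its maximum degree is
$\Delta = \lfloor e^{\alpha/\beta} \rfloor$ and the number of nodes with degree $i$ is

\[
y_i = \begin{cases}
\lfloor e^{\alpha} / i^{\beta} \rfloor, & i > 1 \text{ or } \sum_{i=1}^{\Delta} \lfloor e^{\alpha} / i^{\beta} \rfloor \text{ is even} \\
\lfloor e^{\alpha} \rfloor + 1, & \text{otherwise}
\end{cases}
\tag{1-1}
\]

Note that the number of nodes is $n = e^{\alpha} \zeta(\beta) + O(n^{1/\beta})$ and the number of edges is
$m = \frac{1}{2} e^{\alpha} \zeta(\beta - 1) + O(n^{2/\beta})$, where $\zeta(\beta) = \sum_{i=1}^{\infty} \frac{1}{i^{\beta}}$ is the Riemann Zeta function.

For simplicity, since there is only a very small error $o(1)$ when $\beta > 2$ when counting
the number of both nodes and edges, we denote them as $n = e^{\alpha} \zeta(\beta)$ and
$m = \frac{1}{2} e^{\alpha} \zeta(\beta - 1)$.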
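The degree counts in Definition 1 are easy to tabulate numerically. The sketch below is illustrative only: the function name `power_law_degree_counts` is hypothetical, and the parity adjustment follows the definition as stated (one extra degree-1 node when the sum of the floors is odd).

```python
import math

def power_law_degree_counts(alpha, beta):
    """Return {i: y_i} for an (alpha, beta) power-law graph as in Definition 1."""
    delta = math.floor(math.exp(alpha / beta))  # maximum degree
    counts = {
        i: math.floor(math.exp(alpha) / i ** beta)
        for i in range(1, delta + 1)
    }
    # Parity rule of Definition 1: add one degree-1 node if the sum of floors is odd.
    if sum(counts.values()) % 2 == 1:
        counts[1] += 1
    return counts
```

For instance, with alpha = 4 and beta = 2 the maximum degree is 7 and there are 54 nodes of degree 1, 13 of degree 2, and a single node of degree 7, for 80 nodes in total; the approximation $e^{\alpha} \zeta(\beta)$ gives roughly 89.8, which illustrates the lower-order error term for small graphs.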

1.1.2  Random  Power-Law  Graph  Model 

There are two main categories of random graph models that generate graphs
with skewed degree sequences: evolutionary and structural. Evolutionary models
produce skewed degree distributions by identifying growth primitives, including
multi-objective optimization [6, 33] and statistical preferential attachment [11, 24, 58, 65].
Despite their advantage in exploring additional network semantics, the tight dependencies
between iterations in evolutionary models are the biggest obstacle to probabilistic
analysis [15, 33]. Structural models, on the other hand, start with a given skewed
degree distribution (e.g., a power-law distribution based on the degree sequences of
a real-world network [18]) and generate a graph with that degree sequence, satisfying
certain randomness properties [2, 37, 82]. The greatest advantage of such structural
models is their tractability for theoretical analysis: by taking the skewed degree sequence
as given, they discard the dependencies present in evolutionary models [2, 21, 66]. Although the term
configuration is used, many mathematicians have also noted this advantage by exploiting
several properties of structural random graph models [14, 68, 69].

Therefore, in this dissertation, we use the well-accepted structural PLRG model in
[2] in order to explore power-law networks from an in-depth theoretical perspective.
Given the parameters $\alpha$ and $\beta$, the PLRG model is proposed as a structural approach
to construct an $(\alpha, \beta)$ power-law graph according to its degree sequence $d$, which
consists of a sequence of integers $(1, \ldots, 1, 2, \ldots, 2, \ldots, \Delta)$ where the number of
occurrences of $i$ is equal to $y_i$, defined in Definition 1 above.

Definition 2 (Power-Law Random Graph (PLRG) Model). Let $d = (d_1, d_2, \ldots, d_n)$
be a sequence of integers $(1, \ldots, 1, 2, \ldots, 2, \ldots, \Delta)$ where the number of occurrences of $i$
is equal to $y_i$. The PLRG model generates a random graph as follows. Consider $D = \sum_{i=1}^{n} d_i$
mini-nodes lying in $n$ clusters, where cluster $i$ has size $d_i$ for $1 \le i \le n$. We construct a random
perfect matching among the mini-nodes and generate a graph on the $n$ original nodes as
suggested by this perfect matching in the natural way: two original nodes are connected
by an edge if and only if at least one edge in the random perfect matching connects the
mini-nodes of their corresponding clusters.
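The construction in Definition 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference implementation of [2]: the function name `plrg` is hypothetical, the random perfect matching is obtained by shuffling the mini-nodes and pairing consecutive entries, and matching edges inside a single cluster (self-loops) are dropped, since the definition only speaks of edges between two original nodes.

```python
import random

def plrg(degrees, rng=None):
    """Sample a simple graph from a degree sequence via a random perfect matching.

    degrees : list where degrees[v] is the cluster size d_v of original node v
    rng     : optional random.Random instance for reproducibility
    Returns a set of edges (u, v) with u < v.
    """
    rng = rng or random.Random()
    # One mini-node per unit of degree, labelled by its original node.
    mini = [v for v, deg in enumerate(degrees) for _ in range(deg)]
    if len(mini) % 2:
        raise ValueError("degree sum must be even for a perfect matching")
    rng.shuffle(mini)  # pairing consecutive entries gives a uniform perfect matching
    edges = set()
    for a, b in zip(mini[0::2], mini[1::2]):
        if a != b:  # assumption: ignore matching edges within one cluster
            edges.add((min(a, b), max(a, b)))  # parallel edges collapse to one
    return edges
```

Because parallel edges collapse and self-loops are dropped, the realised degree of a node can be smaller than its prescribed value, which is the usual price for obtaining a simple graph from this model.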

1.2  Optimization  Problems  in  Power-Law  Graphs 

A great number of large-scale real-life networks have been found to follow a
power-law distribution in their degree sequences, ranging from the Internet [34] and the
World-Wide Web (WWW) [4] to social networks [74]. That is, the number of vertices
with degree $i$ is proportional to $i^{-\beta}$ for some constant $\beta$ in these graphs, which are called
power-law graphs. Observations show that the exponential factor $\beta$ ranges between
1 and 4 for most real-world networks [18]. Naturally, the following theoretical question
arises: how do the complexity hardness and inapproximability
factors of optimization problems differ between general graphs and power-law
graphs?

Many experimental results on random power-law graphs suggest that
these problems might be much easier to solve on power-law graphs. Eubank et al. [32]
showed that a simple greedy algorithm leads to a $1 + o(1)$ approximation factor for
Minimum Dominating Set (MDS) and Minimum Vertex Cover (MVC) on power-law
graphs (without any formal proof), although MDS and MVC have been proved NP-hard
to approximate within $(1 - \epsilon) \log n$ and 1.366, respectively, on general graphs
[28]. In [73], Gopal also claimed that there exists a polynomial-time algorithm that
guarantees a $1 + o(1)$ approximation of the MVC problem with probability at least
$1 - o(1)$. Unfortunately, there is no formal proof for this claim either. Furthermore,
several papers give theoretical guarantees for some problems on power-law
graphs. Gkantsidis et al. [36] proved that the flow through each link is at most $O(n \log^2 n)$
on power-law random graphs when $O(d_u d_v)$ units of flow are routed between each
pair of vertices $u$ and $v$ with degrees $d_u$ and $d_v$. In [36], the authors take advantage
of the power-law distribution by using the structural random model [2] and
show the theoretical upper bound with high probability $1 - o(1)$ together with the corresponding
experimental results. Likewise, Janson et al. [48] gave an algorithm that approximates
Maximum Clique within $1 - o(1)$ on power-law graphs with high probability on the
random Poisson model $G(n, \alpha)$ (i.e., the number of vertices with degree at least
$i$ decreases roughly as $n^{-i}$). Although these results were based on experiments
and various random models, they raise interest in investigating the hardness and
inapproximability of optimization problems on power-law graphs.
Recently, Ferrante et al. [35] made an initial attempt on power-law graphs, showing
the NP-hardness of Maximum Clique (Clique) and Minimum Graph Coloring
(Coloring) ($\beta > 1$) by constructing a bipartite graph to embed a general graph into
a power-law graph, and the NP-hardness of MVC, MDS, and Maximum Independent Set
(MIS) ($\beta > 0$) based on their optimal substructure properties.

1.3 Vulnerability Assessment of Power-Law Networks

Most studies investigating this power-law property have focused on how
such degree heterogeneity impacts the robustness of networks [3, 5, 43], or
on how one can quickly and efficiently generate an ideal power-law network with a given
degree sequence [2, 14]. Focusing on the security factor, the works [5, 23, 43, 72] have
shown empirically that power-law networks appear robust under random attacks and
vulnerable to intentional attacks. Nevertheless,
several important security aspects of this property are left untouched. For instance,
are power-law networks surely more vulnerable to intentional attacks than to random
failures? How can we accurately assess the robustness of power-law networks under
various kinds of threats, e.g., random failure and adversarial attack? Can we design more
stable and robust power-law networks by adjusting the parameter $\beta$?

Another limitation of these prior works is their heavy dependence on experiments
and their failure to optimize power-law networks. In other words, we cannot apply them
to enhance the robustness of power-law networks while at the same time reducing their
costs. To the best of our knowledge, this work is the first attempt from a theoretical point
of view to target the two objectives mentioned above: (1) assessing the impact
of random and intentional attacks on power-law networks; (2) optimizing power-law
networks based on their tolerance of threats and their maintenance costs, which are used to
guarantee network functionality and reliability.
1.4 Optimization of Power-Law Networks

Although power-law networks are more robust when $\beta$ is smaller, a majority of
real-world networks have an exponential factor $\beta$ ranging from 2 to 2.5 rather
than some small $\beta$ approaching 1 or even less. The following questions naturally arise: Would
it be better if real-world networks were denser, so that they could be more robust? What
causes them to be sparser than we would expect? Are there potential
optimization factors?

On the other hand, in order to optimally maintain power-law networks, it is of
great importance to assess network vulnerability, that is, to study how much the
network performance degrades under various undesired disruptions, such as natural
disasters, unexpected element failures, or especially adversarial attacks. From a typical
attacker's point of view, an attacker would first exploit the network weaknesses, and then
only needs to target some critical links or nodes whose corruption brings the whole
network to its knees. For instance, an adversarial attack on any essential Internet
provider, e.g., tier-1 ISPs such as Qwest, AT&T, or Sprint servers, once successful, may
cause tremendous breakdowns for millions of companies’ websites and online services.
In a natural disaster, an unexpected earthquake may destroy some important power
lines and consequently lead to a large-area blackout. Therefore, it is crucial to explore
the network vulnerability, i.e., identify those crucial links and nodes, beforehand.

1.5  Outline  of  Dissertation 

The rest of the dissertation, which addresses the above three topics, is organized
as follows. Chapter 2 presents the hardness and inapproximability results of classic
optimization problems in power-law networks, for which we propose two novel techniques
to embed a d-bounded graph into general power-law graphs and simple power-law
graphs, respectively. In addition, we design a Low-Degree Percolation (LDP) Algorithm
Framework for these optimization problems, and further provide a theoretical framework
to analyze approximation ratios in power-law graphs. In Chapter 3, we explore the
vulnerability of power-law networks via in-depth probabilistic analysis, under random
failures, intentional attacks, and random cascading failures. Chapter 4 investigates
the optimization of power-law networks. From a design perspective, we show that, in
both communication and social contexts, power-law networks with an exponential factor
between 1.8 and 2.5 result in the optimal design. Furthermore, in order to better protect
power-law networks, we study the CLD, CND, and CCND problems to detect critical
elements. After showing the NP-hardness of these problems, we develop the HILPR and
TRGA algorithms to solve them in a timely manner. The whole dissertation is concluded
in Chapter 5.
CHAPTER 2

HARDNESS AND APPROXIMATION ALGORITHMS

In this chapter, we develop two new techniques for optimal substructure problems,
the Cycle-Based Embedding Technique and the Graphic Embedding Technique, to embed
a d-bounded graph into a general power-law graph and a simple power-law graph,
respectively. We then use these two techniques to prove the APX-hardness and
the inapproximability of MIS, MDS, and MVC on general power-law graphs and simple
power-law graphs. These inapproximability results on power-law graphs are shown in
Table 2-1. Furthermore, the inapproximability results for Clique and Coloring are
shown by taking advantage of the reduction in [35]. We also analyze the relationship
between $\beta$ and constant greedy approximation algorithms for MIS and MDS.

In addition, motivated by many recent studies of the influence
propagation problem in online social networks [54, 55], we formulate this problem as
p-Minimum Dominating Set (p-MDS) and show that it is hard to approximate within a factor of
$2 - (2 + o_d(1)) \frac{\log\log d}{\log d}$ on d-bounded graphs under the unique games conjecture,
which further leads to the following inapproximability result on power-law graphs (shown in Table 2-1).

Table 2-1. Inapproximability Factors on Power-Law Graphs with Exponential Factor $\beta > 1$

Problem       General Power-Law Graph                                                                              Simple Power-Law Graph
MIS           $1 + \frac{1}{140(2\zeta(\beta)3^{\beta} - 1)} - \epsilon$                                           $1 + \frac{1}{1120\zeta(\beta)3^{\beta}} - \epsilon$
MDS           $1 + \frac{1}{390(2\zeta(\beta)3^{\beta} - 1)}$                                                      $1 + \frac{1}{3120\zeta(\beta)3^{\beta}}$
MVC, p-MDS    $1 + \frac{2(1 - (2 + o_c(1))\frac{\log\log c}{\log c})}{(\zeta(\beta)c^{\beta} + c^{1/\beta})(c+1)}$   $1 + \frac{2 - (2 + o_c(1))\frac{\log\log c}{\log c}}{2\zeta(\beta)c^{\beta}(c+1)}$
Clique        -                                                                                                    $O(n^{1/(\beta+1) - \epsilon})$
Coloring      -                                                                                                    $O(n^{1/(\beta+1) - \epsilon})$

a Conditions: MIS and MDS: P $\neq$ NP; MVC, p-MDS: unique games conjecture; Clique, Coloring: NP $\neq$ ZPP.
b c is a constant which is the smallest d satisfying the condition in [10].

2.1  Preliminaries 

In this section, we first recall the definitions of several classical optimization problems
and formulate the new optimization problem p-Minimum Dominating Set. Then the
power-law model and some corresponding concepts are introduced. Finally, we introduce
some special graphs that will be used in the analysis throughout the paper.

2.1.1  Problem  Definitions 

Definition 3 (Maximum Independent Set). Given an undirected graph $G = (V, E)$,
find a subset $S \subseteq V$ with the maximum size such that no two vertices in $S$ are adjacent.

Definition 4 (Minimum Vertex Cover). Given an undirected graph $G = (V, E)$, find
a subset $S \subseteq V$ with the minimum size such that for each edge in $E$ at least one endpoint
belongs to $S$.

Definition 5 (Minimum Dominating Set). Given an undirected graph $G = (V, E)$, find
a subset $S \subseteq V$ with the minimum size such that for each vertex $v_i \in V \setminus S$, at least one
neighbor of $v_i$ belongs to $S$.

Definition 6 (Maximum Clique). Given an undirected graph $G = (V, E)$, find a
clique with maximum size, where a subgraph of $G$ is called a clique if all its vertices are
pairwise adjacent.

Definition 7 (Minimum Graph Coloring). Given an undirected graph $G = (V, E)$,
label the vertices in $V$ with the minimum number of colors such that no two adjacent vertices
share the same color.
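The feasibility conditions in Definitions 3 to 5 can be checked mechanically. The following is a minimal sketch, assuming a graph stored as a dictionary from each vertex to the set of its neighbors; the helper names and the exhaustive search are illustrative only and are practical only for very small graphs.

```python
from itertools import combinations

def is_independent_set(adj, S):
    # No two chosen vertices may be adjacent (Definition 3).
    return all(v not in adj[u] for u, v in combinations(S, 2))

def is_vertex_cover(adj, S):
    # Every edge needs at least one endpoint in S (Definition 4).
    return all(u in S or v in S for u in adj for v in adj[u])

def is_dominating_set(adj, S):
    # Every vertex outside S needs a neighbor in S (Definition 5).
    return all(v in S or adj[v] & S for v in adj)

def brute_force(adj, feasible, maximize):
    # Try subset sizes from best to worst and return the first feasible subset.
    sizes = range(len(adj), -1, -1) if maximize else range(len(adj) + 1)
    for k in sizes:
        for S in combinations(adj, k):
            if feasible(adj, set(S)):
                return set(S)

# Hypothetical example instance: a cycle on five vertices.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
mis = brute_force(cycle5, is_independent_set, maximize=True)
mvc = brute_force(cycle5, is_vertex_cover, maximize=False)
mds = brute_force(cycle5, is_dominating_set, maximize=False)
```

On this instance the complement of the independent set found is itself a vertex cover, which is the standard relation between the two problems.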

The p-Minimum Dominating Set problem is defined as a general version of the MDS problem.

In the context of influence propagation, the p-MDS problem aims to find a subset of
nodes with minimum size such that all nodes in the whole network can be influenced
within $t$ rounds. In particular, a node is influenced when a $p$ fraction of its neighbors are
influenced. For simplicity, we define the p-MDS problem in the case that $t = 1$.

Definition 8 (p-Minimum Dominating Set). Given an undirected graph $G = (V, E)$,
find a subset $S \subseteq V$ with the minimum size such that for each vertex $v_i \in V \setminus S$,

$$|S \cap N(v_i)| \geq p\,|N(v_i)|.$$

2.1.2 Some Notations


A great number of models [2, 2, 11, 13, 71] on power-law graphs have emerged in
recent years. In this chapter, we do the analysis based on the general $(\alpha, \beta)$
model, that is, the graphs are only constrained by the power-law distribution in their degree
sequences. We first define the following two types of degree sequences.

Definition 9 (y-Degree Sequence). Given a graph $G = (V, E)$, the y-degree sequence
of $G$ is a sequence $Y = (y_1, y_2, \ldots, y_\Delta)$ where $\Delta$ is the maximum degree of $G$ and
$y_i = |\{u \mid u \in V \wedge \deg(u) = i\}|$.

Definition 10 (d-Degree Sequence). Given a graph $G = (V, E)$, the d-degree sequence
of $G$ is a sequence $D = (d_1, d_2, \ldots, d_n)$ of the vertex degrees in non-increasing order.

Note that the y-degree sequence and the d-degree sequence are interchangeable. Given
a y-degree sequence $Y = (y_1, y_2, \ldots, y_\Delta)$, the corresponding d-degree sequence is
$D = (\Delta, \ldots, \Delta, \Delta - 1, \ldots, \Delta - 1, \ldots, 1, \ldots, 1)$ where the number $i$ appears $y_i$
times. Because of their equivalence, we may use only the y-degree sequence or the d-degree
sequence or both without changing the meaning or validity of results.
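The interchangeability of the two sequences can be sketched in a few lines. The function names below are illustrative, and the sketch assumes that vertices of degree 0 are ignored and that the last entry of the y-degree sequence is positive (so that its length equals the maximum degree).

```python
from collections import Counter

def y_to_d(y):
    # y[i - 1] is the number of vertices of degree i; emit degrees from high to low.
    d = []
    for degree in range(len(y), 0, -1):
        d.extend([degree] * y[degree - 1])
    return d

def d_to_y(d):
    # Count how often each degree 1..max(d) occurs.
    counts = Counter(d)
    return [counts[i] for i in range(1, max(d) + 1)] if d else []

y_example = [3, 2, 0, 1]          # three vertices of degree 1, two of degree 2, one of degree 4
d_example = y_to_d(y_example)     # non-increasing d-degree sequence
```

Converting back with `d_to_y(d_example)` recovers `y_example`, which is the equivalence used in the text.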

Definition 11 (Continuous Sequence). An integer sequence $(d_1, d_2, \ldots, d_n)$, where
$d_1 \geq d_2 \geq \cdots \geq d_n$, is continuous if $\forall\, 1 \leq i \leq n - 1$, $|d_i - d_{i+1}| \leq 1$.

Definition 12 (Graphic Sequence). A sequence $D$ is said to be graphic if there exists a
graph such that $D$ is its d-degree sequence.

Definition 13 (Degree Set). Given a graph $G$, let $D_i(G)$ be the set of vertices of degree
$i$ on $G$.

Furthermore, we define the d-bounded graph as follows.

Definition 14 (d-Bounded Graph). Given a graph $G = (V, E)$, $G$ is a d-bounded graph if
the degree of any vertex is upper bounded by an integer constant $d$.

2.1.3  Special  Graphs 

Definition 15 ($\vec{d}$-Regular Cycle $RC_n^{\vec{d}}$). Given a vector $\vec{d} = (d_1, \ldots, d_n)$, a $\vec{d}$-regular
cycle $RC_n^{\vec{d}}$ is composed of two cycles. Each cycle has $n$ vertices, and the two $i$th vertices in
each cycle are adjacent with each other by $d_i - 2$ multi-edges. That is, the $\vec{d}$-regular cycle
$RC_n^{\vec{d}}$ has $2n$ vertices and the two $i$th vertices have the same degree $d_i$. An example $RC_8^{\vec{d}}$ is
shown in Figure 2-1A.

Figure 2-1. Special Graph Examples: The left one (A) is a (3, 3, 3, 3, 3, 3, 3, 3)-regular cycle
and the right one (B) is a (3, 3, 3, 3)-branch-(2, 2, 2, 2, 2, 2)-cycle. The grey
vertices constitute the optimal solution of MDS on these two special graphs.
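As a concrete reading of Definition 15, the following minimal sketch builds the edge list of a regular cycle as a multigraph and checks its degrees. The vertex labels `("a", i)` and `("b", i)` and the function names are assumptions made for illustration; they do not appear in the source.

```python
from collections import Counter

def regular_cycle_edges(d):
    # Two cycles on n vertices each; the two i-th vertices share d[i] - 2 parallel edges.
    n = len(d)
    if n < 3 or any(di < 2 for di in d):
        raise ValueError("need n >= 3 and every degree >= 2")
    edges = []
    for i in range(n):
        j = (i + 1) % n
        edges.append((("a", i), ("a", j)))          # edge on the first cycle
        edges.append((("b", i), ("b", j)))          # edge on the second cycle
        edges.extend([(("a", i), ("b", i))] * (d[i] - 2))  # multi-edges between the pair
    return edges

def degrees(edges):
    # Degree of every vertex, counting parallel edges.
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

deg_rc8 = degrees(regular_cycle_edges([3] * 8))     # the example of Figure 2-1A
deg_mixed = degrees(regular_cycle_edges([2, 3, 4]))
```

Each vertex receives degree 2 from its own cycle and the remaining $d_i - 2$ from the multi-edges, so both $i$th vertices end up with degree $d_i$.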

Definition 16 ($\vec{\kappa}$-Branch-$\vec{d}$-Cycle $\vec{\kappa}$-$BC_n^{\vec{d}}$). Given two vectors $\vec{d} = (d_1, \ldots, d_n)$ and
$\vec{\kappa} = (\kappa_1, \ldots, \kappa_m)$, the $\vec{\kappa}$-branch-$\vec{d}$-cycle is composed of a cycle with $n$ vertices
such that each vertex has degree $d_i$, as well as $|\vec{\kappa}|/2$ appendant branches, where $|\vec{\kappa}|$ is
an even number. Note that any $\vec{\kappa}$-branch-$\vec{d}$-cycle has an even number $|\vec{\kappa}|$ of vertices with odd
degrees. An example is shown in Figure 2-1B.

2.1.4  Existing  Inapproximability  Results 

Here we list some inapproximability results from the literature that are used later in our proofs.

(1) In d-bounded graphs, MVC is hard to approximate within $2 - (2 + o_d(1))\frac{\log\log d}{\log d}$
for every sufficiently large integer $d$ under the unique games conjecture [10, 20].

(2) In 3-bounded graphs, MIS and MDS are NP-hard to approximate within $\frac{140}{139} - \varepsilon$ for
any $\varepsilon > 0$ and $\frac{391}{390}$, respectively [8].

(3) The maximum clique and minimum coloring problems are hard to approximate within
$n^{1-\varepsilon}$ on general graphs unless NP = ZPP [41].

2.2 Inapproximability Optimal Substructure Framework in Power-Law Graphs

In this section, we introduce a framework to derive the approximation hardness
of optimal substructure problems on power-law graphs. A graph optimization problem
is said to satisfy optimal substructure if its optimal solution is the union of the optimal
solutions on each connected component. Therefore, when a graph $G$ is embedded
into a power-law graph $G'$, the optimal solution in $G'$ contains an optimal
solution in $G$ as a subset. According to this important property, we present the Inapproximability
Optimal Substructure Framework to prove the inapproximability factor if there exists
an Embedded-Approximation-Preserving Reduction that relates the approximation
hardness in general graphs and power-law graphs by guaranteeing the relationship
between the solutions in the original graph and the constructed graph.

Definition 17 (Embedded-Approximation-Preserving Reduction). Given an optimal
substructure problem $O$, a reduction from an instance on graph $G = (V, E)$ to another
instance on a power-law graph $G' = (V', E')$ is called embedded-approximation-preserving
if it satisfies the following properties:

(1) $G$ is a subset of the maximal connected components of $G'$;

(2) The optimal solution of $O$ on $G'$, $OPT(G')$, is upper bounded by $\varphi\,OPT(G)$ where
$\varphi$ is a constant corresponding to the growth of the optimal solution.

Theorem 2.1 (Inapproximability Optimal Substructure Framework). Given an optimal
substructure problem $O$, if there exists an embedded-approximation-preserving reduction
from a graph $G$ to another graph $G'$, we can extract the inapproximability factor $\delta$ of
$O$ on $G'$ using the $\varepsilon$-inapproximability of $O$ on $G$, where $\delta$ is lower bounded by
$\frac{\varepsilon\varphi}{(\varphi-1)\varepsilon+1}$ and $\frac{\varepsilon+\varphi-1}{\varphi}$ when $O$ is a maximum and minimum optimization problem, respectively.

Proof. Suppose that there exists an algorithm providing a solution of $O$ on $G'$ within a
factor $\delta$ of the optimal solution. Denote by $A$ and $B$ the sizes of the produced
solution on $G$ and $G' \setminus G$, and by $A^*$ and $B^*$ their corresponding optimal values. Hence,
we have $B^* \leq (\varphi - 1)A^*$. With the completeness that $OPT(G) = A^* \Rightarrow OPT(G') = A^* + B^*$,
the soundness leads to the lower bound on $\delta$, which depends on whether $O$ is a
maximization or a minimization problem, as follows.

Case 1: When $O$ is a maximization problem, we start from the definition of
soundness as

$$A^* + B^* \leq \delta(A + B) \tag{2-1}$$
$$\Rightarrow\ A^* \leq \delta A + (\delta - 1)B^* \tag{2-2}$$
$$\Rightarrow\ A^* \leq \delta A + (\delta - 1)(\varphi - 1)A^* \tag{2-3}$$

where (2-2) holds since $B \leq B^*$ and (2-3) holds since $B^* \leq (\varphi - 1)A^*$.

On the other hand, it is hard to approximate $O$ within $\varepsilon$ on $G$, thus $A^* > \varepsilon A$. Substituting
this into the above inequality, we have:

$$A^* < \frac{\delta}{\varepsilon}A^* + (\delta - 1)(\varphi - 1)A^* \iff \delta > \frac{\varepsilon\varphi}{(\varphi - 1)\varepsilon + 1}$$

Case 2: When $O$ is a minimization problem, since $B^* \leq B$, similarly

$$A + B \leq \delta(A^* + B^*)$$
$$\Rightarrow\ A \leq \delta A^* + (\delta - 1)B^*$$
$$\Rightarrow\ A \leq \delta A^* + (\delta - 1)(\varphi - 1)A^*$$

Then from $A > \varepsilon A^*$,

$$\varepsilon < \delta + (\delta - 1)(\varphi - 1) \iff \delta > \frac{\varepsilon + \varphi - 1}{\varphi}$$

□
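The two bounds of Theorem 2.1 are easy to evaluate exactly. The following sketch uses rational arithmetic; the function name `delta_lower_bound` and the sample values of the growth constant are illustrative assumptions, while the hardness values 140/139 and 391/390 are the ones implied by the constants 140 and 390 used elsewhere in this chapter.

```python
from fractions import Fraction

def delta_lower_bound(eps, phi, maximize):
    # eps: hardness factor on G; phi: growth constant of the embedding (phi >= 1).
    eps, phi = Fraction(eps), Fraction(phi)
    if maximize:
        return eps * phi / ((phi - 1) * eps + 1)
    return (eps + phi - 1) / phi

# Maximization with eps = 140/139 gives exactly 1 + 1/(140*phi - 1).
max_bound = delta_lower_bound(Fraction(140, 139), 5, maximize=True)

# Minimization with eps = 391/390 gives exactly 1 + 1/(390*phi).
min_bound = delta_lower_bound(Fraction(391, 390), 8, maximize=False)
```

With a growth constant of 1 (no added components) both expressions return the original factor, and the bound approaches 1 as the growth constant increases.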

2.3  Hardness  and  Inapproximability  of  Optimal  Substructure  Problems 
2.3.1  General  Cycle-Based  Embedding  Technique 

In this section, we propose a General Cycle-Based Embedding Technique on $(\alpha, \beta)$
power-law graphs with $\beta > 1$. The basic idea is to embed an arbitrary d-bounded graph
into power-law graphs using a $\vec{d}_1$-regular cycle, a $\vec{\kappa}$-branch-$\vec{d}_2$-cycle and a number of
cliques $K_2$, where $\vec{d}_1$, $\vec{d}_2$ and $\vec{\kappa}$ are defined by $\alpha$ and $\beta$. Before discussing the main
embedding technique, we first show that most optimal substructure problems can be
polynomially solved on both $\vec{d}$-regular cycles and $\vec{\kappa}$-branch-$\vec{d}$-cycles. In this context,
the cycle-based embedding technique helps to prove the complexity of these optimal
substructure problems on power-law graphs according to their corresponding complexity
results on general bounded graphs.

Lemma 1. MDS, MVC and MIS are polynomially solvable on $\vec{d}$-regular cycles.

Proof. Here we only prove that the MDS problem is polynomially solvable on $\vec{d}$-regular cycles.
The algorithm is simple. Starting from an arbitrary vertex, we repeatedly select the vertex on the other
cycle two hops away. The algorithm terminates when all vertices are dominated. Now
we show that this gives the optimal solution. Take $RC_8^{\vec{d}}$ as an example. As shown
in Figure 2-1A, the size of the MDS is 4. Notice that each vertex can dominate exactly 3
other vertices, that is, 4 vertices can dominate exactly 12 vertices. However, in $RC_8^{\vec{d}}$ there are
altogether 16 vertices, so the vertices outside the MDS have to be dominated by at least 4
vertices. That is, the algorithm returns an optimal solution. The proofs for MVC
and MIS are similar. □
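The counting argument in this proof can be checked on the example of Figure 2-1A. The sketch below is illustrative only: it builds the (3, ..., 3)-regular cycle on 16 vertices, applies one reading of the two-hop selection rule (alternating between the two cycles), and compares the result with an exhaustive search. The function names and the vertex labels are assumptions.

```python
from itertools import combinations

def regular_cycle_degree3(n):
    # Two n-cycles "a" and "b" with a single edge between the two i-th vertices.
    adj = {(c, i): set() for c in "ab" for i in range(n)}
    for i in range(n):
        for c in "ab":
            adj[(c, i)].add((c, (i + 1) % n))
            adj[(c, (i + 1) % n)].add((c, i))
        adj[("a", i)].add(("b", i))
        adj[("b", i)].add(("a", i))
    return adj

def dominates(adj, S):
    # S dominates the graph if S together with its neighbors covers every vertex.
    covered = set(S)
    for v in S:
        covered |= adj[v]
    return len(covered) == len(adj)

def two_hop_selection(n):
    # Advance two positions at a time and switch to the other cycle at each step.
    return {("ab"[(i // 2) % 2], i) for i in range(0, n, 2)}

def min_dominating_size(adj):
    # Exhaustive search; feasible here because the graph has only 16 vertices.
    for k in range(1, len(adj) + 1):
        if any(dominates(adj, S) for S in combinations(adj, k)):
            return k

rc8 = regular_cycle_degree3(8)
selected = two_hop_selection(8)
optimum = min_dominating_size(rc8)
```

Since every vertex covers itself and three neighbors, four vertices are necessary for 16 vertices, and the selection above attains that bound on this instance.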

Lemma 2. MDS, MVC and MIS are polynomially solvable on $\vec{\kappa}$-branch-$\vec{d}$-cycles.

Proof. Again we show the proof for MDS. First we select the vertices connecting both
the branches and the cycle. Then by removing the branches, we will have a line graph
regardless of self-loops, on which MDS is polynomially solvable. It is easy to see that
the size of the MDS will increase if any vertex connecting both a branch and the cycle
in the MDS is replaced by some other vertices. The proof for MIS is similar. Note that the
optimal solution for MVC consists of all vertices since all edges need to be covered. □

Theorem 2.2 (Cycle-Based Embedding Technique). Any d-bounded graph $G_d$ can
be embedded into a power-law graph $G_{(\alpha,\beta)}$ with $\beta > 1$ such that $G_d$ is a maximal
component and most optimal substructure problems are polynomially solvable on
$G_{(\alpha,\beta)} \setminus G_d$.

Proof. With the given $\beta$, we choose $\alpha$ to be $\max\{\ln\max_{1 \leq i \leq d}\{n_i \cdot i^{\beta}\},\ \beta\ln d\}$, where $n_i$ is
the number of vertices of degree $i$ in $G_d$. Based on
$\tau(i) = \lfloor e^{\alpha}/i^{\beta} \rfloor - n_i$, where $n_i = 0$ when $i > d$, we construct the power-law graph $G_{(\alpha,\beta)}$
as in Algorithm 1. The last step is valid since the number of vertices of odd
degree has to be even. From Step 1, we know $e^{\alpha} = \max\{\max_{1 \leq i \leq d}\{n_i \cdot i^{\beta}\},\ d^{\beta}\} \leq d^{\beta}n$,
that is, the number of vertices $N$ in graph $G_{(\alpha,\beta)}$ satisfies $N \leq \zeta(\beta)d^{\beta}n$, which means that
$N/n$ is bounded by a constant. According to Lemma 1 and Lemma 2, since $G_{(\alpha,\beta)} \setminus G_d$ is composed
of a $\vec{d}_1$-regular cycle and a $\vec{d}_2^{\,1}$-branch-$\vec{d}_2^{\,2}$-cycle, it is polynomially solvable. Note that
the number of vertices in $L$ is at most $\Delta$ since there is at most one leftover vertex of each
degree.


Algorithm 1: Cycle Embedding Algorithm

1 $\alpha \leftarrow \max\{\ln\max_{1 \leq i \leq d}\{n_i \cdot i^{\beta}\},\ \beta\ln d\}$;

2 For the $\tau(1)$ vertices of degree 1, add $\lfloor \tau(1)/2 \rfloor$ cliques $K_2$;

3 For the $\tau(2)$ vertices of degree 2, add a cycle of size $\tau(2)$;

4 For all vertices of degree larger than 2 and smaller than $\Delta$, construct a $\vec{d}_1$-regular
cycle where $\vec{d}_1$ is a vector composed of $\lfloor \tau(i)/2 \rfloor$ elements $i$ for all $i$
satisfying $\tau(i) > 0$;

5 For all leftover isolated vertices $L$ such that $\tau(i) - 2\lfloor \tau(i)/2 \rfloor = 1$, construct a
$\vec{d}_2^{\,1}$-branch-$\vec{d}_2^{\,2}$-cycle, where $\vec{d}_2^{\,1}$ and $\vec{d}_2^{\,2}$ are the vectors containing odd and even
elements corresponding to the vertices of odd and even degrees in $L$, respectively.

□
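The bookkeeping behind Algorithm 1 can be sketched as follows. This is a simplified illustration, not the construction itself: it only computes $e^{\alpha}$, the counts $\tau(i)$ and which degrees leave a leftover vertex. The function name, the returned dictionary keys, the choice to include the maximum degree itself in Step 4, and the small tolerance added before taking floors (to guard against floating-point rounding) are all assumptions.

```python
import math

def cycle_embedding_plan(n_by_degree, beta):
    # n_by_degree[i] is the number of vertices of degree i in the d-bounded graph.
    d = max(n_by_degree)
    e_alpha = max(max(n_i * i ** beta for i, n_i in n_by_degree.items()), d ** beta)
    max_degree = math.floor(e_alpha ** (1 / beta) + 1e-9)
    tau = {}
    for i in range(1, max_degree + 1):
        target = math.floor(e_alpha / i ** beta + 1e-9)  # vertices of degree i in the power-law graph
        tau[i] = target - n_by_degree.get(i, 0)          # vertices that still have to be added
    return {
        "alpha": math.log(e_alpha),
        "tau": tau,
        "k2_cliques": tau[1] // 2,                       # Step 2
        "cycle_size": tau.get(2, 0),                     # Step 3
        "regular_cycle_vector": [i for i in tau if i >= 3 for _ in range(tau[i] // 2)],  # Step 4
        "leftover_degrees": [i for i in tau if i >= 3 and tau[i] % 2 == 1],              # Step 5
    }

# Hypothetical 3-bounded graph with 2, 3 and 4 vertices of degree 1, 2 and 3.
plan = cycle_embedding_plan({1: 2, 2: 3, 3: 4}, beta=2)
total_vertices = sum(plan["tau"].values()) + 9
```

In this example $e^{\alpha} = 36$, and the total number of vertices stays below $\zeta(\beta)d^{\beta}n$, in line with the bound used in the proof.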


2.3.2  APX-Hardness 

In  this  section,  we  prove  that  MIS,  MDS,  MVC  remain  APX-hard  even  on  power-law 
graphs. 

Theorem 2.3. MDS is APX-hard on power-law graphs.

Proof. According to Theorem 2.2, we use the cycle-based embedding technique to
show an L-reduction from MDS on any d-bounded graph $G_d$ to MDS on a power-law graph
$G_{(\alpha,\beta)}$, since MDS is proven APX-hard on d-bounded graphs [51].

Letting $\phi$ be a feasible solution on $G_d$, we can construct an MDS in $G'$ such that the MDS
on a $K_2$ is 1, $n/4$ on a $\vec{d}$-regular cycle and $n/3$ on a cycle and a $\vec{\kappa}$-branch-$\vec{d}$-cycle.
Therefore, for a solution $\phi$ on $G_d$, we have a solution $\psi$ on $G_{(\alpha,\beta)}$ with $\psi = \phi + n_1/2 +
n_2/3 + n_3/4$, where $n_1$, $n_2$ and $n_3$ correspond to $\tau(1)$, $\tau(2) \cup L$ and all leftover vertices.
Hence, we have $OPT(\psi) = OPT(\phi) + n_1/2 + n_2/3 + n_3/4$.

On one hand, for a d-bounded graph with $n$ vertices, the optimal MDS is lower
bounded by $n/(d + 1)$. Thus, we know

$$OPT(\psi) = OPT(\phi) + n_1/2 + n_2/3 + n_3/4$$
$$\leq OPT(\phi) + (N - n)/2 \leq OPT(\phi) + (\zeta(\beta)d^{\beta} - 1)n/2$$
$$\leq OPT(\phi) + (\zeta(\beta)d^{\beta} - 1)(d + 1)\,OPT(\phi)/2 = \left[1 + (\zeta(\beta)d^{\beta} - 1)(d + 1)/2\right]OPT(\phi)$$

where $N$ is the number of vertices in $G_{(\alpha,\beta)}$.

On the other hand, with $|OPT(\phi) - \phi| = |OPT(\psi) - \psi|$, we have proved the L-reduction
with $c_1 = 1 + (\zeta(\beta)d^{\beta} - 1)(d + 1)/2$ and $c_2 = 1$. □

Theorem 2.4. MVC is APX-hard on power-law graphs.

Proof. In this proof, we show an L-reduction from MVC on a d-bounded graph $G_d$ to MVC on a
power-law graph $G_{(\alpha,\beta)}$ using the cycle-based embedding technique.

Let $\phi$ be a feasible solution on $G_d$. We construct the solution $\psi \leq \phi + (N - n)$,
since the optimal solution of MVC is $n/2$ on $K_2$, cycles and $\vec{d}$-regular cycles, and $n$ on
$\vec{\kappa}$-branch-$\vec{d}$-cycles. Therefore, since the optimal MVC on a d-bounded graph is lower
bounded by $n/(d + 1)$, we have

$$OPT(\psi) \leq \left[1 + (\zeta(\beta)d^{\beta} - 1)(d + 1)\right]OPT(\phi)$$

On the other hand, with $|OPT(\phi) - \phi| = |OPT(\psi) - \psi|$, we have proved the L-reduction
with $c_1 = 1 + (\zeta(\beta)d^{\beta} - 1)(d + 1)$ and $c_2 = 1$. □

Corollary 1. MIS is APX-hard on power-law graphs.

2.3.3  Inapproximability  Factors 

In  this  section,  we  show  the  inapproximability  factors  on  MIS,  MVC  and  MDS  on 
power-law  graphs  respectively  using  the  results  in  section  2.1.4. 

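Theorem 2.5. For any $\varepsilon > 0$, there is no $1 + \frac{1}{140(2\zeta(\beta)3^{\beta}-1)} - \varepsilon$ approximation algorithm for
Maximum Independent Set on power-law graphs.

Proof. In this proof, we construct the power-law graph $G_{(\alpha,\beta)}$ based on the cycle-based
embedding technique in Theorem 2.2 from a d-bounded graph $G_d$. Let $\phi$ and $\psi$ be feasible
solutions of MIS on $G_d$ and $G_{(\alpha,\beta)}$. Then $OPT(\psi)$ is composed of $OPT(\phi)$ and the optimal solutions on the cliques $K_2$,
cycle, $\vec{d}$-regular cycle and $\vec{\kappa}$-branch-$\vec{d}$-cycles, which are all exactly half the number of vertices.
Hence, we have $OPT(\psi) = OPT(\phi) + (N - n)/2$ where $n$ and $N$ are the numbers of
vertices in $G_d$ and $G_{(\alpha,\beta)}$, respectively. Since $OPT(\phi) \geq n/(d + 1)$ on d-bounded graphs
for MIS and $N \leq \zeta(\beta)d^{\beta}n$, we further have $\varphi = 1 + \frac{(\zeta(\beta)d^{\beta}-1)(d+1)}{2}$ from

$$OPT(\psi) = OPT(\phi) + \frac{N - n}{2} \leq OPT(\phi) + \frac{(\zeta(\beta)d^{\beta} - 1)\,n}{2}$$
$$\leq OPT(\phi) + \frac{(\zeta(\beta)d^{\beta} - 1)(d + 1)}{2}\,OPT(\phi)$$
$$= \left(1 + \frac{(\zeta(\beta)d^{\beta} - 1)(d + 1)}{2}\right)OPT(\phi)$$

According to $\varepsilon = \frac{140}{139} - \varepsilon'$ for any $\varepsilon' > 0$ on 3-bounded graphs, the
inapproximability factor can be derived from the inapproximability optimal substructure
framework as

$$\delta > \frac{\varepsilon\varphi}{(\varphi - 1)\varepsilon + 1} > 1 + \frac{1}{140\varphi} - \varepsilon = 1 + \frac{1}{140(2\zeta(\beta)3^{\beta} - 1)} - \varepsilon$$

where the last step follows from $d = 3$.

□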

Theorem 2.6. There is no $1 + \frac{1}{390(2\zeta(\beta)3^{\beta}-1)}$ approximation algorithm for Minimum
Dominating Set on power-law graphs.

Proof. In this proof, we construct the power-law graph $G_{(\alpha,\beta)}$ based on the cycle-based
embedding technique in Theorem 2.2 from a d-bounded graph $G_d$. Let $\phi$ and $\psi$ be feasible
solutions of MDS on $G_d$ and $G_{(\alpha,\beta)}$. The optimal MDS on $G_{(\alpha,\beta)}$ consists of $OPT(\phi)$ together with
$n/2$ of the vertices on the cliques $K_2$, $n/4$ on the $\vec{d}$-regular cycle, and $n/3$ on the cycle and the
$\vec{\kappa}$-branch-$\vec{d}$-cycles. Then we have $\varphi = 1 + \frac{(\zeta(\beta)d^{\beta}-1)(d+1)}{2}$, similarly to
the proof of Theorem 2.5.

According to $\varepsilon = \frac{391}{390}$ on 3-bounded graphs, the inapproximability factor can be
derived from the inapproximability optimal substructure framework as

$$\delta > 1 + \frac{\varepsilon - 1}{\varphi} = 1 + \frac{1}{390(2\zeta(\beta)3^{\beta} - 1)}$$

where the last step follows from $d = 3$.

□
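To get a feeling for the size of these constants, the factors of Theorems 2.5 and 2.6 can be evaluated numerically. The sketch below is illustrative: the function names are assumptions, the small additive term $\varepsilon$ of Theorem 2.5 is omitted, and $\zeta(\beta)$ is approximated by a partial sum plus an integral estimate of the tail rather than by a library routine.

```python
import math

def zeta(beta, terms=100_000):
    # Riemann zeta for beta > 1: partial sum plus the integral of x**(-beta) over the tail.
    partial = sum(i ** -beta for i in range(1, terms + 1))
    return partial + terms ** (1 - beta) / (beta - 1)

def growth_constant(beta, d=3):
    # Growth constant 1 + (zeta(beta) * d**beta - 1) * (d + 1) / 2 from the proofs above.
    return 1 + (zeta(beta) * d ** beta - 1) * (d + 1) / 2

def mis_factor(beta):
    # Factor of Theorem 2.5 without the additive epsilon.
    return 1 + 1 / (140 * (2 * zeta(beta) * 3 ** beta - 1))

def mds_factor(beta):
    # Factor of Theorem 2.6.
    return 1 + 1 / (390 * (2 * zeta(beta) * 3 ** beta - 1))

factors = {beta: (mis_factor(beta), mds_factor(beta)) for beta in (1.5, 2, 3)}
```

For $d = 3$ the growth constant equals $2\zeta(\beta)3^{\beta} - 1$, which is the term appearing in both denominators; the resulting factors are only slightly above 1 and shrink further as $\beta$ grows.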

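Theorem 2.7. MVC is hard to approximate within $1 + \frac{2\left(1-(2+o_c(1))\frac{\log\log c}{\log c}\right)}{\left(\zeta(\beta)c^{\beta}+c^{1/\beta}\right)(c+1)}$ on power-law
graphs under the unique games conjecture.

Proof. By constructing the power-law graph $G_{(\alpha,\beta)}$ based on the cycle-based embedding
technique in Theorem 2.2 from a d-bounded graph $G_d$, the optimal MVC on the cliques
$K_2$, the cycle and the $\vec{d}$-regular cycle is half the number of vertices, while the optimal MVC on
$\vec{\kappa}$-branch-$\vec{d}$-cycles consists of all vertices. Thus, we have $\varphi = 1 + \frac{\left(\zeta(\beta)d^{\beta}-1+d^{1/\beta}\right)(d+1)}{2}$ since

$$OPT(\psi) \leq OPT(\phi) + \frac{N - n - \Delta}{2} + \Delta \leq OPT(\phi) + \frac{(\zeta(\beta)d^{\beta} - 1)n + n^{1/\beta}d}{2} \tag{2-4}$$
$$= OPT(\phi) + \frac{\left(\zeta(\beta)d^{\beta} - 1 + \frac{d}{n^{1-1/\beta}}\right)n}{2} \tag{2-5}$$
$$\leq OPT(\phi) + \frac{\left(\zeta(\beta)d^{\beta} - 1 + \frac{d}{n^{1-1/\beta}}\right)(d + 1)}{2}\,OPT(\phi) \tag{2-6}$$
$$\leq \left[1 + \frac{\left(\zeta(\beta)d^{\beta} - 1 + d^{1/\beta}\right)(d + 1)}{2}\right]OPT(\phi) \tag{2-7}$$

where $\phi$ and $\psi$ are feasible solutions of MVC on $G_d$ and $G_{(\alpha,\beta)}$, and $\Delta$ is the maximum
degree in $G_{(\alpha,\beta)}$. The inequality (2-4) holds since there are at most $\Delta$ vertices in the
$\vec{\kappa}$-branch-$\vec{d}$-cycle, i.e. $\Delta = e^{\alpha/\beta} \leq n^{1/\beta}d$; (2-6) and (2-7) hold since there are at least $d + 1$
vertices in a d-bounded graph and the optimal MVC in a d-bounded graph is at least
$n/(d + 1)$.

According to $\varepsilon = 2 - (2 + o_d(1))\frac{\log\log d}{\log d}$, the inapproximability factor can
be derived from the inapproximability optimal substructure framework as

$$\delta \geq 1 + \frac{\varepsilon - 1}{\varphi} \geq 1 + \frac{2\left(1 - (2 + o_c(1))\frac{\log\log c}{\log c}\right)}{\left(\zeta(\beta)c^{\beta} + c^{1/\beta}\right)(c + 1)}$$

where $c$ is the smallest $d$ satisfying the condition in [10]. The last inequality holds since the
function $f(x) = \left(1 - (2 + o_x(1))\frac{\log\log x}{\log x}\right)/\left(g(x)(x + 1)\right)$ is monotonically decreasing
when $f(x) > 0$ for all $x > 0$ when $g(x)$ is monotonically increasing. □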

Theorem 2.8. p-MDS is hard to approximate within $2 - (2 + o_d(1))\frac{\log\log d}{\log d}$ on d-bounded
graphs under the unique games conjecture.

Proof. In this proof, we show a gap-preserving reduction from MVC on a $(d/p)$-bounded graph
$G = (V, E)$ to p-MDS on a d-bounded graph $G' = (V', E')$. W.l.o.g., we assume that $d$
and $d/p$ are integers. We construct a graph $G' = (V', E')$ by adding new vertices and
edges to $G$ as follows. For each edge $(v_i, v_j) \in E$, create $k$ new vertices $v_{ij}^1, \ldots, v_{ij}^k$ where
$1 \leq k \leq \lfloor 1/p \rfloor$ and $p \leq 1/2$. Then we add $2k$ new edges $(v_{ij}^l, v_i)$ and $(v_{ij}^l, v_j)$ for all
$l \in [1, k]$ as shown in Figure 2-2. Clearly, $G' = (V', E')$ is a d-bounded graph.


Figure  2-2.  The  Reduction  from  MVC  to  p-MDS 

Let $\phi$ and $\psi$ be feasible solutions to MVC on $G$ and p-MDS on $G'$, respectively. We claim that
$OPT(\phi) = OPT(\psi)$.

On one hand, if $S = \{v_1, v_2, \ldots, v_j\} \subseteq V$ is a minimum vertex cover on $G$, then
$\{v_1, v_2, \ldots, v_j\}$ is a p-MDS on $G'$ because each vertex in $V$ has a $p$ fraction of all its neighbors in
the MVC and every new vertex in $V' \setminus V$ has at least one of its two neighbors in the MVC. Thus
$OPT(\phi) \geq OPT(\psi)$. On the other hand, we can prove that $OPT(\psi)$ does not contain
new vertices, that is, vertices in $V' \setminus V$. Consider a vertex $v_i \in V$. If $v_i \in OPT(\psi)$, the new vertices
$v_{ij}^l$ for all $v_j \in N(v_i)$ and all $l \in [1, k]$ do not need to be selected. If $v_i \notin OPT(\psi)$, it
has to be dominated by a $p$ proportion of all its neighbors. That is, for each edge $(v_i, v_j)$
incident to $v_i$, either $v_j$ or all $v_{ij}^l$ have to be selected, since every $v_{ij}^l$ has to be either
selected or dominated. If all $v_{ij}^l$ are selected in $OPT(\psi)$ for some edge $(v_i, v_j)$, $v_j$ is still
not dominated by enough vertices if there are some more edges incident to $v_j$ and the
number $k$ of vertices $v_{ij}^l$ is greater than 1, that is, $\lfloor 1/p \rfloor > 1$. In this case, $v_j$ will be selected
to dominate all $v_{ij}^l$. Thus, $OPT(\psi)$ does not contain new vertices. Since the vertices selected in $V$
form a solution to p-MDS, for each vertex $v_i$ in graph $G$, either $v_i$ is selected
or the required number of neighbors of $v_i$ is selected. Therefore, the vertices in
$OPT(\psi)$ constitute a vertex cover in $G$. Thus $OPT(\phi) \leq OPT(\psi)$. Then we show the
completeness and soundness as follows.

• If $OPT(\phi) = m \Rightarrow OPT(\psi) = m$

• If $OPT(\phi) > \left(2 - (2 + o_d(1))\frac{\log\log(d/p)}{\log(d/p)}\right)m \Rightarrow OPT(\psi) > \left(2 - (2 + o_d(1))\frac{\log\log d}{\log d}\right)m$

$$OPT(\psi) > \left(2 - (2 + o_d(1))\frac{\log\log(d/p)}{\log(d/p)}\right)m \geq \left(2 - (2 + o_d(1))\frac{\log\log d}{\log d}\right)m$$

since the function $f(x) = 2 - \log\log x/\log x$ is monotonically increasing for
sufficiently large $x$.

□
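The construction in this proof can be tried out on a very small instance. The sketch below is illustrative only: it uses a triangle with $k = 1$ new vertex per edge and $p = 1/2$ (parameter choices made for the example, not prescribed by the source), adds the new vertices, and compares a minimum vertex cover of $G$ with a minimum p-dominating set of $G'$ by exhaustive search. All function names are assumptions.

```python
from itertools import combinations

def add_edge_gadgets(edges, k):
    # For every edge (u, v), add k new vertices ("new", u, v, l) adjacent to both u and v.
    adj = {}
    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    for u, v in edges:
        link(u, v)
        for l in range(k):
            w = ("new", u, v, l)
            link(w, u)
            link(w, v)
    return adj

def is_p_dominating(adj, S, p):
    # Definition 8: every vertex outside S has at least a p fraction of its neighbors in S.
    return all(len(adj[v] & S) >= p * len(adj[v]) for v in adj if v not in S)

def min_vertex_cover(edges):
    vertices = sorted({x for e in edges for x in e})
    for size in range(len(vertices) + 1):
        for S in combinations(vertices, size):
            if all(u in S or v in S for u, v in edges):
                return set(S)

def min_p_dominating_size(adj, p):
    for size in range(len(adj) + 1):
        if any(is_p_dominating(adj, set(S), p) for S in combinations(adj, size)):
            return size

triangle = [(0, 1), (1, 2), (0, 2)]
g_prime = add_edge_gadgets(triangle, k=1)
cover = min_vertex_cover(triangle)
pmds_size = min_p_dominating_size(g_prime, p=0.5)
```

On this instance the vertex cover of the triangle is also a p-dominating set of the extended graph, and no smaller p-dominating set exists, so the two optima coincide as claimed.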

Theorem 2.9. p-MDS is hard to approximate within $1 + \frac{2\left(1-(2+o_c(1))\frac{\log\log c}{\log c}\right)}{2+(\zeta(\beta)c^{\beta}-1)(c+1)}$ on power-law
graphs under the unique games conjecture.

Proof. By constructing the power-law graph $G_{(\alpha,\beta)}$ based on the cycle-based embedding
technique in Theorem 2.2 from a d-bounded graph $G_d$, and according to the optimal p-MDS
on $OPT(\phi)$, the cliques $K_2$, the cycle, the $\vec{d}$-regular cycle and the $\vec{\kappa}$-branch-$\vec{d}$-cycles, we have
$\varphi = 1 + \frac{(\zeta(\beta)d^{\beta}-1)(d+1)}{2}$ from

$$OPT(\psi) = OPT(\phi) + n_1/2 + f(p)\,n_2 + g(p)\,n_3$$
$$\leq OPT(\phi) + \frac{N - n}{2} \leq \left(1 + \frac{(\zeta(\beta)d^{\beta} - 1)(d + 1)}{2}\right)OPT(\phi)$$

where $f(p)$ and $g(p)$ are coefficients defined piecewise in $p$ (their exact definitions are
illegible in the source), and $\phi$, $\psi$ are feasible solutions
of p-MDS on $G_d$ and $G_{(\alpha,\beta)}$. $n_1$, $n_2$ and $n_3$ correspond to the numbers of vertices in
the cliques $K_2$, the cycle, the $\vec{d}$-regular cycle and the $\vec{\kappa}$-branch-$\vec{d}$-cycle.

According to $\varepsilon = 2 - (2 + o_d(1))\frac{\log\log d}{\log d}$, the inapproximability factor can
be derived from the inapproximability optimal substructure framework as

$$\delta \geq 1 + \frac{\varepsilon - 1}{\varphi} \geq 1 + \frac{2\left(1 - (2 + o_c(1))\frac{\log\log c}{\log c}\right)}{2 + (\zeta(\beta)c^{\beta} - 1)(c + 1)}$$

where $c$ is the smallest $d$ satisfying the condition in [10]. The last inequality holds since the
function $f(x) = \left(1 - (2 + o_x(1))\frac{\log\log x}{\log x}\right)/\left(g(x)(x + 1)\right)$ is monotonically decreasing
when $f(x) > 0$ for all $x > 0$ when $g(x)$ is monotonically increasing.

□

2.4  More  Inapproximability  Results  on  Simple  Power-Law  Graphs 
2.4.1  General  Graphic  Embedding  Technique 

In this section, we introduce a general graphic embedding technique to embed
a d-bounded graph into a simple power-law graph. Before presenting the embedding
technique, we first show that a graph can be constructed in polynomial time from a class
of integer sequences.

Lemma 3. Given a sequence of integers $D = (d_1, d_2, \ldots, d_n)$ which is non-increasing and
continuous, and whose number of elements is at least twice the largest element in $D$,
i.e. $n \geq 2d_1$, it is possible to construct a simple graph $G$ whose d-degree sequence is $D$
in polynomial time $O(n^2 \log n)$.

Proof. Starting with a set $S$ of $n$ isolated vertices of degree 0, we iteratively
connect vertices together to increase their degrees up to the given degree sequence. In
each step, the leftover vertex of highest degree is connected to other vertices one by
one in decreasing order of their degrees. Then the sequence $D$ is re-sorted
and all zero elements are removed. The algorithm stops when $D$ is empty. The whole
algorithm is shown as follows (Algorithm 2).

Algorithm 2: Graphic Sequence Construction Algorithm

Input : d-degree sequence $D = (d_1, d_2, \ldots, d_n)$ where $d_1 \geq d_2 \geq \cdots \geq d_n$
Output: Graph $H$
1 while $D \neq \emptyset$ do
2     Connect the vertex of $d_1$ to the vertices of $d_2, d_3, \ldots, d_{d_1+1}$;
3     for $i = 2$ to $d_1 + 1$ do
4         $d_i \leftarrow d_i - 1$;
5     end
6     $d_1 \leftarrow 0$;
7     Sort $D$ in non-increasing order;
8     Remove all zero elements in $D$;
9 end

After each while loop, the new degree sequence, called $D'$, is still continuous and
its number of elements is at least twice its maximum element. To show this, we
consider three cases: (1) If the maximum degree in $D'$ remains the same, there are at
least $d_1 + 2$ vertices of degree $d_1$ in $D$. Since $D$ is continuous, the number of elements in $D$ is at least
$d_1 + 2 + d_1 - 1$, that is, $2d_1 + 1$. Therefore, the number of elements in $D'$ is at least $2d_1$, i.e.
$n \geq 2d_1$ still holds. (2) If the maximum degree in $D'$ is decreased by 1, there are at least
2 elements of degree $d_1$ in $D$. Thus, at most one element in $D$ will become 0. Then we
have $n \geq 2d_1 - 2 = 2(d_1 - 1)$. (3) If the maximum degree in $D'$ is decreased by 2, there
are at most two elements in $D$ becoming 0. Thus, $n \geq 2d_1 - 3 \geq 2(d_1 - 2)$.

The time complexity of the algorithm is $O(n^2 \log n)$ since there are at most $n$
iterations and each iteration takes at most $O(n \log n)$ to sort the new sequence $D$. □
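Algorithm 2 follows the classical Havel-Hakimi scheme. A minimal runnable sketch is given below; the function name is an assumption, vertices are identified by their positions in the input sequence, and the sketch raises an error for sequences that cannot be realized (for instance those with an odd degree sum, a case the lemma does not mention explicitly).

```python
def havel_hakimi(degrees):
    # Return the edge list of a simple graph whose vertex v has degree degrees[v].
    residual = list(degrees)
    edges = []
    while True:
        # Vertices that still need edges, highest residual degree first.
        order = sorted((v for v in range(len(residual)) if residual[v] > 0),
                       key=lambda v: (-residual[v], v))
        if not order:
            return edges
        head, rest = order[0], order[1:]
        need = residual[head]
        if need > len(rest):
            raise ValueError("sequence is not graphic")
        for v in rest[:need]:
            edges.append((head, v))
            residual[v] -= 1
        residual[head] = 0  # head is finished and is never touched again

# Non-increasing, continuous, and n = 8 >= 2 * d_1 = 6, with an even degree sum.
sequence = [3, 3, 2, 2, 2, 2, 1, 1]
edge_list = havel_hakimi(sequence)
realized = [sum(1 for e in edge_list if v in e) for v in range(len(sequence))]
```

Because a vertex is never connected again once its residual degree reaches zero, no pair of vertices is joined twice, so the result is a simple graph. Re-sorting in every round mirrors the $O(n \log n)$ cost per iteration stated in the proof.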

Theorem 2.10 (Graphic Embedding Technique). Any d-bounded graph $G_d$ can be
embedded into a simple power-law graph $G_{(\alpha,\beta)}$ with $\beta > 1$ in polynomial time such that
$G_d$ is a maximal component and the number of vertices in $G_{(\alpha,\beta)}$ can be polynomially
bounded by the number of vertices in $G_d$.

Proof. Given a d-bounded graph $G_d = (V, E)$ and $\beta > 1$, we construct a
power-law graph $G_{(\alpha,\beta)}$ of exponential factor $\beta$ which includes $G_d$ as a set of maximal
components. The construction is shown as Algorithm 3.

Algorithm 3: Graphic Embedding Algorithm

1 $\alpha \leftarrow \max\left\{\frac{\beta}{\beta-1}(\ln 4 + \beta\ln d),\ \ln 2 + \ln n + \beta\ln d\right\}$ and the corresponding $G_{(\alpha,\beta)}$;

2 Let $D$ be the d-degree sequence of $G_{(\alpha,\beta)} \setminus G_d$;

3 Construct $G_{(\alpha,\beta)} \setminus G_d$ using Algorithm 2.

According to Lemma 3, the above construction is valid and finishes in polynomial
time. Then we show that $N$ is upper bounded by $2\zeta(\beta)d^{\beta}n$, where $n$ and $N$ are the
numbers of vertices in $G_d$ and $G_{(\alpha,\beta)}$, respectively. Since $\alpha$ is the maximum of the two terms
in Step 1, we know both

$$\alpha \geq \frac{\beta}{\beta - 1}(\ln 4 + \beta\ln d) \Rightarrow \alpha \geq \ln 4 + \beta\ln d + \frac{\alpha}{\beta} \Rightarrow \frac{e^{\alpha}}{d^{\beta}} \geq 4e^{\alpha/\beta}$$

and

$$\alpha \geq \ln 2 + \ln n + \beta\ln d \Rightarrow \frac{e^{\alpha}}{d^{\beta}} \geq 2n$$

Therefore, $\frac{e^{\alpha}}{d^{\beta}} \geq 2e^{\alpha/\beta} + n$. Note that $\frac{e^{\alpha}}{d^{\beta}}$ is the number of vertices of degree $d$. In
addition, $G_d$ has at most $n$ vertices of degree $d$, so $D$ is a continuous degree sequence and
its number of elements is at least twice the maximum degree.

In addition, when $n$ is large enough, we have $\alpha = \ln 2 + \ln n + \beta\ln d$. Hence, the
number of vertices $N$ in $G_{(\alpha,\beta)}$ is bounded as $N \leq \zeta(\beta)e^{\alpha} = 2\zeta(\beta)d^{\beta}n$, i.e. the number of
vertices of $G_{(\alpha,\beta)}$ is polynomially bounded by the number of vertices in $G_d$.

□
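The choice of $\alpha$ in Step 1 of Algorithm 3 and the inequality derived from it can be checked numerically. The sketch below is illustrative, with assumed function names and sample parameters; it evaluates $\alpha$ and the slack $e^{\alpha}/d^{\beta} - (2e^{\alpha/\beta} + n)$, which the proof shows to be non-negative.

```python
import math

def graphic_embedding_alpha(n, d, beta):
    # Step 1 of Algorithm 3: alpha is the larger of the two terms.
    first = beta / (beta - 1) * (math.log(4) + beta * math.log(d))
    second = math.log(2) + math.log(n) + beta * math.log(d)
    return max(first, second)

def degree_d_slack(n, d, beta):
    # Vertices of degree d in the power-law graph minus the required 2*e^(alpha/beta) + n.
    alpha = graphic_embedding_alpha(n, d, beta)
    return math.exp(alpha) / d ** beta - (2 * math.exp(alpha / beta) + n)

samples = [(100, 3, 2.0), (5, 3, 1.5), (1000, 4, 3.0)]
slacks = [degree_d_slack(n, d, beta) for n, d, beta in samples]
```

For the first sample the second term dominates, so $e^{\alpha} = 2nd^{\beta} = 1800$, which is the large-$n$ regime used at the end of the proof.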


2.4.2  Inapproximability  of  MIS,  MVC  and  MDS 

Theorem 2.11. For any $\varepsilon > 0$, it is NP-hard to approximate Maximum Independent Set
within $1 + \frac{1}{1120\zeta(\beta)3^{\beta}} - \varepsilon$ on simple power-law graphs.

Proof. In this proof, we construct the simple power-law graph $G_{(\alpha,\beta)}$ based on the graphic
embedding technique in Theorem 2.10 from a d-bounded graph $G_d$. Let $\phi$ and $\psi$ be
feasible solutions of MIS on $G_d$ and $G_{(\alpha,\beta)}$. Since $OPT(\phi) \geq n/(d + 1)$ on d-bounded
graphs and $N \leq 2\zeta(\beta)d^{\beta}n$, we further have $\varphi = 2\zeta(\beta)d^{\beta}(d + 1)$ from

$$OPT(\psi) \leq N \leq 2\zeta(\beta)d^{\beta}n \leq 2\zeta(\beta)d^{\beta}(d + 1)\,OPT(\phi)$$

According to $\varepsilon = \frac{140}{139} - \varepsilon'$ for any $\varepsilon' > 0$ on 3-bounded graphs, the
inapproximability factor can be derived from the inapproximability optimal substructure
framework as

$$\delta > \frac{\varepsilon\varphi}{(\varphi - 1)\varepsilon + 1} > 1 + \frac{1}{140\varphi - 1} - \varepsilon > 1 + \frac{1}{1120\zeta(\beta)3^{\beta}} - \varepsilon$$

□


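Theorem 2.12. It is NP-hard to approximate Minimum Dominating Set within
$1 + \frac{1}{3120\zeta(\beta)3^{\beta}}$ on simple power-law graphs.

Proof. From the proof of Theorem 2.11, we have $\varphi = 2\zeta(\beta)d^{\beta}(d + 1)$. Then according to
$\varepsilon = \frac{391}{390}$ on 3-bounded graphs, we have

$$\delta > 1 + \frac{\varepsilon - 1}{\varphi} = 1 + \frac{1}{3120\zeta(\beta)3^{\beta}}$$

□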


Theorem 2.13. There is no $1 + \frac{2-(2+o_c(1))\frac{\log\log c}{\log c}}{2\zeta(\beta)c^{\beta}(c+1)}$ approximation algorithm for Minimum
Vertex Cover on simple power-law graphs under the unique games conjecture.

Proof. Similarly to the proof of Theorem 2.12, we have $\varphi = 2\zeta(\beta)d^{\beta}(d + 1)$. Then,
according to $\varepsilon = 2 - (2 + o_d(1))\frac{\log\log d}{\log d}$, the inapproximability factor can be
derived from the inapproximability optimal substructure framework as

$$\delta > 1 + \frac{2 - (2 + o_c(1))\frac{\log\log c}{\log c}}{2\zeta(\beta)c^{\beta}(c + 1)}$$

where $c$ is the smallest $d$ satisfying the condition in [10].

□

Theorem 2.14. There is no $1 + \frac{2-(2+o_c(1))\frac{\log\log c}{\log c}}{2\zeta(\beta)c^{\beta}(c+1)}$ approximation algorithm for
p-Minimum Dominating Set on simple power-law graphs.

Proof. Similarly to Theorem 2.13, the proof follows from Theorem 2.8.

□


2.4.3  Maximum  Clique,  Minimum  Coloring 

Lemma 4 (Ferrante et al. [35]). Let $G = (V, E)$ be a simple graph with $n$ vertices and $\beta > 1$. Let $\alpha \ge \max\{4\beta, \beta\log n + \log(n+1)\}$. Then, $G_2 = G \setminus G_1$ is a bipartite graph.

Lemma 5. Given a monotonically decreasing function $f(x)$ ($x \in \mathbb{Z}$, $f(x) \in \mathbb{Z}^{+}$), then

$$\sum_x f(x) \le \int_x f(x)$$

Corollary 2. $e^{\alpha}\sum_{i=1}^{e^{\alpha/\beta}} \frac{1}{i^{\beta}} < \frac{e^{\alpha} - e^{\alpha/\beta}}{\beta - 1}$.

Theorem 2.15. Maximum Clique cannot be approximated within $O(n^{1/(\beta+1)-\epsilon})$ on large power-law graphs with $\beta > 1$ and $n > 54$ for any $\epsilon > 0$ unless NP=ZPP.

Proof. In [35], the authors proved the hardness of the Maximum Clique problem on power-law graphs. Here we use the same construction. According to Lemma 4, $G_2 = G \setminus G_1$ is a bipartite graph when $\alpha \ge \max\{4\beta, \beta\log n + \log(n+1)\}$ for any $\beta > 1$. Let $\phi$ be a solution on the general graph $G$ and $\varphi$ be a solution on the power-law graph $G_2$. We show the completeness and soundness.

•  If $OPT(\phi) = m \Rightarrow OPT(\varphi) = m$

If $OPT(\phi) \le 2$ on graph $G$, we can solve the clique problem in polynomial time by iterating over the edges and their endpoints one by one. However, $G$ is not a general graph in this case. W.l.o.g., assuming $OPT(\phi) > 2$, then $OPT(\varphi) = OPT(\phi) > 2$ since the maximum clique on a bipartite graph is 2.

•  If $OPT(\phi) \le m/n^{1-\epsilon} \Rightarrow OPT(\varphi) < O\left(1/N^{1/(\beta+1)-\epsilon'}\right) m$

In this case, we consider the case that $4\beta < \beta\log n + \log(n+1)$, that is, $n > 54$. According to Lemma 4, let $\alpha = \beta\log n + \log(n+1)$. From Corollary 2, we have

$$\frac{n^{\beta}(n+1) - n(n+1)^{1/\beta}}{\beta - 1} < \frac{2n^{\beta+1} - n}{\beta - 1}$$

□
Corollary 3. The Minimum Coloring problem cannot be approximated within $O(n^{1/(\beta+1)-\epsilon})$ on large power-law graphs with $\beta > 1$ and $n > 54$ for any $\epsilon > 0$ unless NP=ZPP.

2.5 Relationship between $\beta$ and Approximation Hardness
As shown in previous sections, many hardness and inapproximability results depend on $\beta$. In this section, we analyze the hardness of some optimal substructure problems based on $\beta$ by showing that trivial greedy algorithms can achieve constant guarantee factors for MIS and MDS.

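Lemma 6. When $\beta > 2$, the size of the MDS of a power-law graph is greater than $Cn$, where $n$ is the number of vertices and $C$ is some constant dependent only on $\beta$.

Proof. Let $S = (v_1, v_2, \ldots, v_t)$ of degrees $d_1, d_2, \ldots, d_t$ be the MDS of power-law graph $G_{(\alpha,\beta)}$. Observing that the total degree of the vertices in a dominating set must be at least the number of vertices outside the dominating set, we have $\sum_{i=1}^{t} d_i \ge |V \setminus S|$. With a given total degree, a set of vertices has minimum size when it includes the vertices of highest degrees. Since the function $\zeta(\beta-1) = \sum_{i=1}^{\infty} \frac{1}{i^{\beta-1}}$ converges when $\beta > 2$, there exists a constant $t_0 = t_0(\beta)$ such that

$$\sum_{i=t_0}^{e^{\alpha/\beta}} i\left\lfloor \frac{e^{\alpha}}{i^{\beta}} \right\rfloor \le \sum_{i=1}^{t_0-1} \left\lfloor \frac{e^{\alpha}}{i^{\beta}} \right\rfloor$$

where $\alpha$ is any large enough constant. Thus the size of the MDS is at least

$$\sum_{i=t_0}^{e^{\alpha/\beta}} \left\lfloor \frac{e^{\alpha}}{i^{\beta}} \right\rfloor \approx \left(\zeta(\beta) - \sum_{i=1}^{t_0-1}\frac{1}{i^{\beta}}\right) e^{\alpha} \approx C|V|$$

where $C = \left(\zeta(\beta) - \sum_{i=1}^{t_0-1} \frac{1}{i^{\beta}}\right)/\zeta(\beta)$. □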

Consider the greedy algorithm which selects vertices from the highest degree to the lowest. In the worst case, it selects all vertices with degree greater than 1 and half of the vertices with degree 1 to form a dominating set. The approximation factor of this simple algorithm is a constant.

Corollary 4. Given a power-law graph with $\beta > 2$, the greedy algorithm that selects vertices in decreasing order of degrees provides a dominating set of size at most $\sum_{i=2}^{e^{\alpha/\beta}} \lfloor e^{\alpha}/i^{\beta} \rfloor + \frac{1}{2}e^{\alpha} \approx (\zeta(\beta) - 1/2)e^{\alpha}$. Thus the approximation ratio is $\left(\zeta(\beta) - \frac{1}{2}\right)/\left(\zeta(\beta) - \sum_{i=1}^{t_0-1} 1/i^{\beta}\right)$.
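As an illustration only, the degree-ordered greedy selection can be sketched in Python as follows. The function name `greedy_dominating_set`, the adjacency-dictionary input, and the rule of skipping a vertex that dominates nothing new are assumptions of this sketch; the analysis above only states that vertices are taken in decreasing order of degree.

```python
def greedy_dominating_set(adj):
    """Return a dominating set of the graph given as {vertex: set(neighbours)}."""
    # Visit vertices from the highest degree to the lowest.
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    dominated = set()
    chosen = []
    for v in order:
        # Assumption of this sketch: skip v when it and all its neighbours
        # are already dominated, since selecting it would not help.
        if v in dominated and all(u in dominated for u in adj[v]):
            continue
        chosen.append(v)
        dominated.add(v)
        dominated.update(adj[v])
    return chosen


# Hypothetical toy input: a star with centre 0 and four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(greedy_dominating_set(star))
```

Every vertex is visited once and is selected whenever it is still undominated at that point, so the returned set is always dominating; the skip rule only avoids selections that add nothing.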

Let us consider another problem, the maximization problem MIS, for which we propose a greedy algorithm Power-law-Greedy-MIS as follows. We sort the vertices in non-increasing order of degrees and start checking from the vertex of lowest degree. If the vertex is not adjacent to any selected vertex, it is selected. The set of selected vertices forms an independent set of size at least half the number of vertices of degree 1, which is $e^{\alpha}/2$. The size of the MIS is at most half of the number of vertices. Thus, the following lemma holds.

Lemma 7. Power-law-Greedy-MIS has factor $1/(2\zeta(\beta))$ on power-law graphs with $\beta > 1$.
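A minimal Python sketch of Power-law-Greedy-MIS, under the assumption that the graph is given as a dictionary mapping each vertex to the set of its neighbours (the function name and input format are illustrative choices, not taken from the text):

```python
def power_law_greedy_mis(adj):
    """Greedy independent set on {vertex: set(neighbours)}, lowest degree first."""
    # Check vertices starting from the lowest degree.
    order = sorted(adj, key=lambda v: len(adj[v]))
    selected = set()
    for v in order:
        # Select v only if none of its neighbours has been selected.
        if not (adj[v] & selected):
            selected.add(v)
    return selected


# Hypothetical toy input: a path on four vertices.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(power_law_greedy_mis(path))
```

Because a vertex is added only when no neighbour is already selected, the result is independent by construction.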
2.6 Minor NP-Hardness on Simple Power-Law Graphs for $\beta < 1$
In this section, we show some minor NP-hardness results for optimal substructure problems on simple power-law graphs for small $\beta < 1$.

Definition 18 (Eligible Sequences). A sequence of integers $S = (s_1, \ldots, s_n)$ is eligible if $s_1 \ge s_2 \ge \ldots \ge s_n$ and $f_S(k) \ge 0$ for all $k \in [n]$, where

$$f_S(k) = k(k-1) + \sum_{i=k+1}^{n} \min\{k, s_i\} - \sum_{i=1}^{k} s_i$$

Erdős and Gallai [31] showed that an integer sequence is graphic, i.e. the degree sequence of a graph, if and only if it is eligible and the total of all elements is even. Then Havel and Hakimi [16] gave an algorithm to construct a simple graph from a degree sequence. We now prove the following eligible embedding technique based on this result.
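The eligibility test of Definition 18 and the Erdős and Gallai criterion can be checked numerically. The sketch below is a direct transcription of $f_S(k)$; the function names `f_s`, `is_eligible` and `is_graphic` are hypothetical, and non-negative integer entries are assumed.

```python
def f_s(seq, k):
    """f_S(k) from Definition 18; seq is sorted non-increasingly, k is 1-based."""
    tail = sum(min(k, s) for s in seq[k:])   # terms i = k+1 .. n
    head = sum(seq[:k])                      # terms i = 1 .. k
    return k * (k - 1) + tail - head


def is_eligible(seq):
    """True if seq is non-increasing and f_S(k) >= 0 for every k in [n]."""
    non_increasing = all(a >= b for a, b in zip(seq, seq[1:]))
    return non_increasing and all(f_s(seq, k) >= 0 for k in range(1, len(seq) + 1))


def is_graphic(seq):
    """Erdos-Gallai criterion: eligible after sorting and with an even total."""
    ordered = sorted(seq, reverse=True)
    if any(s < 0 for s in ordered):
        return False
    return sum(ordered) % 2 == 0 and is_eligible(ordered)


print(is_graphic([2, 2, 2]), is_graphic([3, 3, 1, 1]))
```

For example, $(3, 3, 1, 1)$ has an even total but $f_S(2) = 2 + 2 - 6 < 0$, so it is not eligible and hence not graphic.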

Theorem 2.16 (Eligible Embedding Technique). Given an undirected simple graph $G = (V, E)$ and $0 < \beta < 1$, there exists a polynomial time algorithm to construct a power-law graph $G' = (V', E')$ of exponential factor $\beta$ such that $G$ is a set of maximal components of $G'$.

Proof. To construct $G'$, we choose $\alpha = \max\{\beta\ln(n-1) + \ln(n+2), 3\ln 2\}$. Then $\lfloor e^{\alpha}/(n-1)^{\beta} \rfloor \ge n+2$, i.e. there are at least 2 vertices of degree $d$ in $G' \setminus G$ if there are at least 2 vertices of degree $d$ in $G'$. According to the definition, the total degrees of all vertices in $G'$ and $G$ are even. Therefore, the theorem will follow if we prove that the degree sequence $D$ of $G' \setminus G$ is eligible.

In $D$, the maximum degree is $\lfloor e^{\alpha/\beta} \rfloor$. There is only one vertex of degree $i$ if $1 \le e^{\alpha}/i^{\beta} < 2$, i.e. $e^{\alpha/\beta} \ge i > (e^{\alpha}/2)^{1/\beta}$.

Let us consider $f_D(k)$ in two cases:

Case 1: $k \le \lfloor e^{\alpha/\beta}/2 \rfloor$

$$\begin{aligned}
f_D(k) &= k(k-1) + \sum_{i=k+1}^{n} \min\{k, d_i\} - \sum_{i=1}^{k} d_i \\
&\ge k(k-1) + \sum_{i=k}^{T-k} k + \sum_{i=B}^{k-1} i + 2\sum_{i=1}^{B-1} i - \sum_{i=1}^{k} (T - i + 1) \\
&= k(T-k) + (k-B)(k-1+B)/2 + B(B-1) - k(2T-k+1)/2 \\
&= (B^2 - B)/2 - k
\end{aligned}$$

where $T = \lfloor e^{\alpha/\beta} \rfloor$ and $B = \lfloor (e^{\alpha}/2)^{1/\beta} \rfloor + 1$. Note that $\alpha/\beta > \ln 2\,(2/\beta + 1)$ since $\alpha \ge 3\ln 2$ and $0 < \beta < 1$. Hence $\left(\lfloor (e^{\alpha}/2)^{1/\beta} \rfloor + 1\right)\lfloor (e^{\alpha}/2)^{1/\beta} \rfloor > \lfloor e^{\alpha/\beta} \rfloor \ge 2k$, that is, $f_D(k) > 0$.

Case 2: $k > \lfloor e^{\alpha/\beta}/2 \rfloor$

$$f_D(k+1) \ge f_D(k) + 2k - 2d_{k+1} \ge f_D(k) \ge \ldots \ge f_D(\lfloor e^{\alpha/\beta}/2 \rfloor) > 0$$

□

Corollary 5. An optimal substructure problem is also NP-hard on power-law graphs for all $0 < \beta < 1$ if it is NP-hard on simple general graphs.

Proof. According to Theorem 2.16, we can embed an undirected graph $G = (V, E)$ into a power-law graph $G'$ with $\beta$ lying in $(0, 1)$ in time polynomial in the size of $G$. Since the optimization problem has the optimal substructure property and $G$ is a set of maximal connected components of $G'$, its optimum solution for the graph $G$ can be computed easily from an optimal solution for $G'$. This completes the proof of NP-hardness. □

2.7  Approximation  Algorithms 

Although the computational hardness and inapproximability of classic optimization problems have been shown in the previous sections, the design of approximation algorithms is still of great interest and remains open. In this section, we focus on addressing the following questions: Can the power-law degree distribution help us design an effective algorithm framework for NP-hard optimization problems? How can we provide a theoretical framework for analyzing approximation ratios of these problems using this power-law degree property? Will these approximation ratios change dramatically for different exponential factors $\beta$, i.e. in power-law graphs with different densities?

We propose an algorithm framework, called the Low-Degree Percolation (LDP) framework, to solve optimization problems in power-law networks, including the MIS, MDS, and MVC problems. The idea of the LDP framework, which percolates the graph starting from the great number of low-degree nodes in a power-law graph, allows us to develop a theoretical framework that can be used to analyze the approximation ratios via probability theory. In particular, we apply this theoretical framework to show the approximation ratios for these problems on two well-known random power-law models in [2, 21]. Finally, numerical analysis of our proposed approaches not only validates our theoretical analysis but also illustrates the effectiveness of our approaches in practice.
2.7.1 Low-Degree Percolation (LDP) Algorithm Framework

In this section, we propose an algorithm framework to solve optimization problems by taking advantage of the degree sequence property of power-law graphs. The most fundamental property of power-law graphs is that they contain a great number of low-degree nodes and only a small number of high-degree nodes. Therefore, the idea of our proposed Low-Degree Percolation (LDP) algorithm framework is to sort the nodes by their degree and percolate the graph from the nodes of lowest degree. The process continues iteratively in the residual graph until no more nodes that are surely in an optimal solution can be detected. Finally, we apply existing approximation approaches to find the solution in the remaining graph.

For the MDS and MVC problems, as shown in Algorithm 4, since the node incident to a node of degree 1 certainly belongs to an optimal solution, we percolate the graph by adding all the neighbors of nodes with degree 1 in each iteration. Once no more nodes of degree 1 exist in the residual graph, we apply the existing approximation algorithm in [83] for MDS (or [52] for MVC) to obtain the solution in this residual graph.


Algorithm 4: LDP Algorithm for MDS/MVC Problems

Input : Power-law graph G
Output: MDS (or MVC) S
while ∃ nodes of degree 1 do
    foreach node v of degree 1 do
        Add its neighbor N(v) into S;
        Remove v from G;
    end
    Remove all nodes incident to S from graph G;
end
Determine the leftover MDS (or MVC) in G using the existing approximation algorithm in [83] (or [52]) and add it into S;
return S;
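The percolation loop of Algorithm 4 can be sketched in Python for the MVC case. This is a sketch under stated assumptions, not the implementation used in the experiments: the graph is an adjacency dictionary, the name `ldp_vertex_cover` is hypothetical, and the final step uses the classical maximal-matching 2-approximation as a stand-in for the algorithm cited as [52].

```python
def ldp_vertex_cover(adj):
    """Vertex cover of {vertex: set(neighbours)} via low-degree percolation."""
    g = {v: set(ns) for v, ns in adj.items()}  # private copy of the graph
    cover = set()

    def remove(v):
        # Delete v together with all its incident edges.
        for u in g.pop(v):
            g[u].discard(v)

    while True:
        leaves = [v for v in g if len(g[v]) == 1]
        if not leaves:
            break
        for v in leaves:
            # v may have been deleted or isolated earlier in this pass.
            if v not in g or len(g[v]) != 1:
                continue
            (u,) = g[v]
            cover.add(u)   # the neighbour of a degree-1 node joins S
            remove(u)      # every edge at u is now covered

    # Stand-in for the algorithm cited as [52] (an assumption of this sketch):
    # take both endpoints of a greedily built maximal matching.
    for v, ns in g.items():
        for u in ns:
            if v not in cover and u not in cover:
                cover.update((u, v))
    return cover


# Hypothetical toy input: a path on four vertices.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(ldp_vertex_cover(path))
```

A connected component of size 2 is handled by the same loop: the first endpoint visited puts the other one into the cover, which matches the remark that either endpoint may be chosen.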


On the other hand, Algorithm 5 shows the algorithm for MIS. In this case, the nodes of degree 1 belong to the optimal solution and, at the same time, it is certain that their neighbors cannot be in the optimal solution. Therefore, in order to obtain the MIS, we select all nodes of degree 1 into the solution in each iteration. Finally, we apply the approximation algorithm in [40] to obtain the MIS in the remaining graph.


Algorithm 5: LDP Algorithm for MIS Problem

Input : Power-law graph G
Output: MIS S
while ∃ nodes of degree 1 do
    foreach node v of degree 1 do
        Add v into S;
        Remove v and all its neighbors N(v) from G;
    end
end
Determine the leftover MIS in G using the existing approximation algorithm in [40] and add it into S;
return S;
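Algorithm 5 can be sketched in the same way. The adjacency-dictionary input and the name `ldp_independent_set` are illustrative assumptions, and the final step uses a simple minimum-degree greedy rule as a stand-in for the algorithm cited as [40].

```python
def ldp_independent_set(adj):
    """Independent set of {vertex: set(neighbours)} via low-degree percolation."""
    g = {v: set(ns) for v, ns in adj.items()}  # private copy of the graph
    chosen = set()

    def remove(v):
        # Delete v together with all its incident edges.
        for u in g.pop(v):
            g[u].discard(v)

    while True:
        leaves = [v for v in g if len(g[v]) == 1]
        if not leaves:
            break
        for v in leaves:
            # v may have been deleted or isolated earlier in this pass.
            if v not in g or len(g[v]) != 1:
                continue
            (u,) = g[v]
            chosen.add(v)  # the degree-1 node joins S
            remove(u)      # its neighbour can no longer be selected
            remove(v)

    # Stand-in for the algorithm cited as [40] (an assumption of this sketch):
    # repeatedly take a vertex of minimum degree and drop its neighbours.
    while g:
        v = min(g, key=lambda x: len(g[x]))
        chosen.add(v)
        for u in list(g[v]):
            remove(u)
        remove(v)
    return chosen


# Hypothetical toy input: a star with centre 0 and four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(ldp_independent_set(star))
```

Vertices isolated by the percolation step have degree 0 in the residual graph, so the minimum-degree rule picks them up in the final phase.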


Here, we note that in the special case where two nodes of degree 1 are connected, the optimal solution of MDS (or MVC, MIS) contains either one of them.

2.7.2  Approximation  Ratio  Analysis 

In this section, we analyze the approximation ratios of the LDP algorithms in both structural and expected random power-law networks. To do this, we first provide a theoretical framework, using the LDP algorithm, to analyze the approximation ratio based on the probability that a node does not connect to any node of degree 1. Then, this framework is applied to show the ratios of optimization problems in two different models.

2.7.2.1 Theoretical framework

In this theoretical framework, as a connected component of size 2 is trivial, we mainly focus on the ratio analysis in the rest of the power-law graph. To begin with, we provide a formal proof of the following Lemma 8 (a similar argument applies to Corollary 6), which was briefly discussed with the LDP algorithms.

Lemma 8. In the optimal solution to MDS and MVC, if we do not consider the case of connected components of size 2, there do not exist any nodes of degree 1, and all nodes incident to at least one node of degree 1 are selected.

Proof. Let $u$ be a node of degree 1 incident to another node $v$ of arbitrary degree larger than 1. We consider several cases: (1) If neither $u$ nor $v$ is selected in the optimal solution, no neighbor is selected for $u$ and $u$ is not selected either, which leads to an infeasible solution; (2) If both $u$ and $v$ are selected, it is easy to see that the solution is no longer optimal; (3) If $u$ is selected instead of $v$, we have to select a further set of nodes to satisfy $v$ if $v$ has degree no less than 2; (4) If $v$ is selected instead of $u$, both $u$ and $v$ are already satisfied, which means the size of the solution is less than the size of a solution containing $u$. According to these observations, the proof is complete. □

Corollary 6. In the optimal solution to MIS, if we do not consider the case of connected components of size 2, all nodes of degree 1 are selected and no node incident to at least one node of degree 1 is selected.

Next, we define $p(\alpha, \beta, i)$ to be the probability that a node $v$ of degree $i$ is not incident to any node of degree 1 in a power-law graph $G_{(\alpha,\beta)}$. Our purpose is to analyze the approximation ratio based on $p(\alpha, \beta, i)$ in this graph $G_{(\alpha,\beta)}$.

Let $X_i^u$ be an indicator random variable for the event that a node $u$ of degree $i$ does not connect to any node of degree 1. Then, we have

$$X_i^u = \begin{cases} 1, & u \notin D_1 \\ 0, & u \in D_1 \end{cases}$$

where $D_1$ is the set of nodes incident to at least one node of degree 1. Note that all nodes of the same degree have identically distributed random variables. For simplicity, we define $X_i$ to be this random variable for some node of degree $i$. Therefore, the expected value for a node $u$ not incident to any node of degree 1 is

$$E(X_i) = p(\alpha, \beta, i)$$

Since the number of nodes of degree $i$ is equal to $e^{\alpha}/i^{\beta}$, by letting $\Delta = e^{\alpha/\beta}$ and $X = \sum_{i=2}^{\Delta} \frac{e^{\alpha}}{i^{\beta}} X_i$, we have the following lemma:




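Lemma 9. The expected number of nodes of degree no less than 2 not incident to any node of degree 1 is

$$\sum_{i=2}^{\Delta} \frac{e^{\alpha}}{i^{\beta}}\, p(\alpha, \beta, i)$$

Proof. The number of nodes not incident to any node of degree 1 is the sum over all nodes of degree no less than 2, i.e. $X = \sum_{i=2}^{\Delta} \frac{e^{\alpha}}{i^{\beta}} X_i$. Then we have

$$E(X) = \sum_{i=2}^{\Delta} \frac{e^{\alpha}}{i^{\beta}} E(X_i) = \sum_{i=2}^{\Delta} \frac{e^{\alpha}}{i^{\beta}}\, p(\alpha, \beta, i)$$

□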


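Lemma 10. The variance of $X$ is upper bounded by

$$e^{2\alpha} \sum_{i=2}^{\Delta}\sum_{j=2}^{\Delta} \frac{\sqrt{\chi(\alpha,\beta,i)\,\chi(\alpha,\beta,j)}}{(ij)^{\beta}}$$

where $\chi(\alpha, \beta, i) = p(\alpha, \beta, i)(1 - p(\alpha, \beta, i))$.

Proof. For a random variable $X_i$ corresponding to a node of degree $i$ not incident to any node of degree 1, the variance is

$$Var(X_i) = (1 - p(\alpha,\beta,i))(1 - (1 - p(\alpha,\beta,i))) = p(\alpha,\beta,i)(1 - p(\alpha,\beta,i))$$

For any two variables $X_i$ and $X_j$ corresponding to two nodes of degree $i$ and $j$ not incident to any node of degree 1, according to the Cauchy-Schwarz inequality, we have

$$|Cov(X_i, X_j)| \le \sqrt{Var(X_i)Var(X_j)} = \sqrt{\chi(\alpha,\beta,i)\,\chi(\alpha,\beta,j)}$$

Then, we sum them up and obtain

$$\sum_{X_i, X_j} |Cov(X_i, X_j)| \le \sum_{X_i, X_j} \sqrt{Var(X_i)Var(X_j)} = \sum_{i=2}^{\Delta}\sum_{j=2}^{\Delta} \frac{e^{\alpha}}{i^{\beta}}\frac{e^{\alpha}}{j^{\beta}}\sqrt{\chi(\alpha,\beta,i)\,\chi(\alpha,\beta,j)} = e^{2\alpha}\sum_{i=2}^{\Delta}\sum_{j=2}^{\Delta} \frac{\sqrt{\chi(\alpha,\beta,i)\,\chi(\alpha,\beta,j)}}{(ij)^{\beta}}$$

Therefore, the variance of $X$ satisfies

$$Var(X) \le \sum_{X_i, X_j} |Cov(X_i, X_j)| \le e^{2\alpha}\sum_{i=2}^{\Delta}\sum_{j=2}^{\Delta} \frac{\sqrt{\chi(\alpha,\beta,i)\,\chi(\alpha,\beta,j)}}{(ij)^{\beta}}$$

□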


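Lemma 11. The number of nodes of degree no less than 2 that are not incident to any node of degree 1 is larger than $\lambda \sum_{i=2}^{\Delta} \frac{e^{\alpha}}{i^{\beta}}$ with probability at most

$$\frac{1}{\dfrac{\left(\sum_{i=2}^{\Delta} \frac{1}{i^{\beta}}\left(\lambda - p(\alpha,\beta,i)\right)\right)^2}{\sum_{i=2}^{\Delta}\sum_{j=2}^{\Delta} \frac{\sqrt{\chi(\alpha,\beta,i)\,\chi(\alpha,\beta,j)}}{(ij)^{\beta}}} + 1}$$

Proof. Let $\phi = \lambda\sum_{i=2}^{\Delta}\frac{e^{\alpha}}{i^{\beta}}$. According to the one-sided Chebyshev inequality,

$$Pr[X > \phi] = Pr\left[X - E(X) > \frac{\phi - E(X)}{\sqrt{Var(X)}}\sqrt{Var(X)}\right] \le \frac{1}{\dfrac{(\phi - E(X))^2}{Var(X)} + 1} \le \frac{1}{\dfrac{\left(\sum_{i=2}^{\Delta} \frac{1}{i^{\beta}}\left(\lambda - p(\alpha,\beta,i)\right)\right)^2}{\sum_{i=2}^{\Delta}\sum_{j=2}^{\Delta} \frac{\sqrt{\chi(\alpha,\beta,i)\,\chi(\alpha,\beta,j)}}{(ij)^{\beta}}} + 1}$$

□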


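For simplicity, we define the following $p_{\lambda}$ and obtain Corollary 7.

$$p_{\lambda} = \frac{1}{\dfrac{\left(\sum_{i=2}^{\Delta} \frac{1}{i^{\beta}}\left(\lambda - p(\alpha,\beta,i)\right)\right)^2}{\sum_{i=2}^{\Delta}\sum_{j=2}^{\Delta} \frac{\sqrt{\chi(\alpha,\beta,i)\,\chi(\alpha,\beta,j)}}{(ij)^{\beta}}} + 1}$$

Corollary 7. The number of nodes of degree no less than 2 incident to at least one node of degree 1 is at least $(1 - \lambda)\sum_{i=2}^{\Delta} \frac{e^{\alpha}}{i^{\beta}}$ with probability at least $1 - p_{\lambda}$.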
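To make the quantities in Lemmas 9 to 11 concrete, the following Python sketch evaluates the expectation and the probability bound for a user-supplied function `p(i)` standing in for $p(\alpha, \beta, i)$. The function names, the rounding of $\Delta$ down to an integer, and the sample `p` in the usage line are all assumptions made for illustration; no particular form of $p(\alpha, \beta, i)$ is implied.

```python
import math


def max_degree(alpha, beta):
    """Delta = e^(alpha/beta), rounded down to an integer for summation."""
    return int(math.floor(math.exp(alpha / beta)))


def expected_uncovered(alpha, beta, p):
    """E(X): sum over i = 2..Delta of (e^alpha / i^beta) * p(i)."""
    degrees = range(2, max_degree(alpha, beta) + 1)
    return sum(math.exp(alpha) / i**beta * p(i) for i in degrees)


def p_lambda(alpha, beta, lam, p):
    """One-sided Chebyshev bound 1 / (gap^2 / spread + 1)."""
    degrees = range(2, max_degree(alpha, beta) + 1)
    gap = sum((lam - p(i)) / i**beta for i in degrees)
    if gap <= 0:
        return 1.0  # the bound is vacuous when lambda is too small
    # The double sum of sqrt(chi_i * chi_j) / (i*j)^beta factorises
    # into the square of a single sum.
    root = sum(math.sqrt(p(i) * (1.0 - p(i))) / i**beta for i in degrees)
    if root == 0.0:
        return 0.0
    return 1.0 / (gap**2 / root**2 + 1.0)


# Hypothetical p(i), chosen only to exercise the functions.
print(p_lambda(5.0, 2.0, 0.5, lambda i: 0.5 ** i))
```

The factor $e^{2\alpha}$ cancels between $(\phi - E(X))^2$ and the variance bound, which is why it does not appear in `p_lambda`.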

Then, based on Lemma 8, we derive the following approximation ratios of MDS and MVC in a power-law graph $G_{(\alpha,\beta)}$.

Theorem 2.17 (Main Theorem (MDS&MVC)). In a power-law graph $G_{(\alpha,\beta)}$, by using Algorithm 4, MDS and MVC can be approximated within

$$1 + (\psi - 1)\lambda$$

with probability at least $1 - p_{\lambda}$, where $\psi$ is the approximation ratio of MDS (or MVC) in Algorithm [83] (or [52]) w.r.t. a graph of size at most $e^{\alpha}\sum_{i=2}^{\Delta}\frac{1}{i^{\beta}}$.

Proof. Let $t$ be the number of nodes incident to nodes of degree 1 in some power-law graph $G_{(\alpha,\beta)}$. We have the approximation ratio as

$$\frac{t + \psi\, OPT}{t + OPT} \le \frac{t + \psi\left(\sum_{i=2}^{\Delta}\frac{e^{\alpha}}{i^{\beta}} - t\right)}{t + \sum_{i=2}^{\Delta}\frac{e^{\alpha}}{i^{\beta}} - t}$$

According to Corollary 7, we have $t \ge \sum_{i=2}^{\Delta}\frac{e^{\alpha}}{i^{\beta}} - \lambda\sum_{i=2}^{\Delta}\frac{e^{\alpha}}{i^{\beta}}$ with probability at least $1 - p_{\lambda}$. The proof is complete.

□


In  terms  of  MIS,  we  have  the  approximation  ratio  as  follows: 

Theorem 2.18 (Main Theorem (MIS)). In a power-law graph $G_{(\alpha,\beta)}$, by using Algorithm 5,
MIS can be approximated into


with probability at least $1 - p_{\lambda}$, where $N$ is the number of nodes with degree 1, G is the approximation ratio of MIS in Algorithm [40] w.r.t. a graph of size at most $e^{\alpha}\sum_{i=2}^{\Delta}\frac{1}{i^{\beta}}$.

The proof is omitted due to its similarity to the proof of Theorem 2.17.

Next, we focus on applying this framework to the PLRG model and analyzing the approximation ratios.

2.7.2.2  Power-law  random  graph 

In a PLRG graph, the straightforward computation of $p_{PLRG}(\alpha, \beta, i)$ is intractable due to the difficulty of enumerating all possible combinations. To this end, we consider each case in which there is a particular number of connected components of size 2 in PLRG. Finally, the approximation factors can be derived from the law of total probability. In the rest of this subsection, we show the probability of having $\tau$ connected components of size 2 in a PLRG graph and each $p^{\tau}_{PLRG}(\alpha, \beta, i)$ respectively, and apply them to obtain the approximation ratios.

Lemma 12. The probability $Pr[C_2 = \tau]$ that there are $\tau$ connected components of size 2 in a PLRG graph is

$$\frac{\binom{w}{2\tau}(2\tau)!!\binom{N-w}{w-2\tau}(w-2\tau)!}{N!!/(N-2w-1+2\tau)!!}$$

where $N = e^{\alpha}\zeta(\beta-1)$ and $w = e^{\alpha}$ is the number of nodes of degree 1.

Proof. In order to have $\tau$ connected components of size 2, $2\tau$ mini-nodes are first selected from all $w$ nodes of degree 1. Moreover, there are $(2\tau-1)!!$ possibilities to match these $2\tau$ mini-nodes. Since the number of perfect matchings $f(n)$ for $n$ mini-nodes is $(n-1)!!$, the probability can be calculated by simplifying the following equation.

$$Pr[C_2 = \tau] = \frac{\binom{w}{2\tau}(2\tau)!!\binom{N-w}{w-2\tau}(w-2\tau)!\,f(N-2w+2\tau)}{f(N)}$$

□


Lemma 13. In a PLRG graph $G$, if there are $\tau$ connected components of size 2, the probability that a node $v$ of degree $i$ is not incident to any node of degree 1 is

$$p^{\tau}_{PLRG}(\alpha, \beta, i) = \begin{cases} \prod_{k=0}^{w_{\tau}-1} \dfrac{N_{\tau} - i - w_{\tau} - k}{N_{\tau} - w_{\tau} - k}, & \text{if } N_{\tau} - i - w_{\tau} \ge w_{\tau}; \\ 0, & \text{otherwise,} \end{cases}$$

where $N_{\tau} = e^{\alpha}\zeta(\beta-1) - 2\tau$ and $w_{\tau} = e^{\alpha} - 2\tau$ is the number of nodes of degree 1.

Proof. Let $D_1$ be the set of nodes incident to at least one node of degree 1. Consider that the whole set of mini-nodes is composed of three subsets, i.e., $i$ mini-nodes corresponding to $v$, $w_{\tau}$ mini-nodes corresponding to all nodes of degree 1, and all leftover mini-nodes, which are referred to as $N_i$, $N_w$ and $N_{\tau} \setminus \{N_i \cup N_w\}$ respectively. When $N_{\tau} - i - w_{\tau} < w_{\tau}$, there are not enough mini-nodes to match all nodes of degree 1, so the probability that $v \notin D_1$ is 0. Otherwise, in order for $v \notin D_1$, we have to select the nodes incident to all nodes in $N_w$ from $N_{\tau} \setminus \{N_i \cup N_w\}$.

$$Pr[v \notin D_1] = \frac{\binom{N_{\tau} - i - w_{\tau}}{w_{\tau}}\, w_{\tau}!\, f(N_{\tau} - 2w_{\tau})}{\binom{N_{\tau} - w_{\tau}}{w_{\tau}}\, w_{\tau}!\, f(N_{\tau} - 2w_{\tau})} = \prod_{k=0}^{w_{\tau}-1} \frac{N_{\tau} - i - w_{\tau} - k}{N_{\tau} - w_{\tau} - k}$$

where $f(n) = (n-1)!!$, representing the number of perfect matchings for $n$ nodes.

□
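The product in Lemma 13 is straightforward to evaluate numerically. The sketch below assumes that $N_{\tau}$ and $w_{\tau}$ have already been rounded to integers by the caller; the function name `p_plrg` and its argument names are hypothetical.

```python
def p_plrg(n_tau, w_tau, i):
    """Probability from Lemma 13 that a degree-i node avoids all degree-1 nodes.

    n_tau: total number of mini-nodes outside the size-2 components (integer)
    w_tau: number of degree-1 nodes outside the size-2 components (integer)
    """
    if n_tau - i - w_tau < w_tau:
        return 0.0  # not enough mini-nodes to match every degree-1 node
    prob = 1.0
    for k in range(w_tau):
        prob *= (n_tau - i - w_tau - k) / (n_tau - w_tau - k)
    return prob


# Hypothetical small instance: 10 mini-nodes, 2 degree-1 nodes, degree i = 2.
print(p_plrg(10, 2, 2))
```

For this instance the product is $(6/8)(5/7) = 15/28$, and the value decreases as $i$ grows, since a higher-degree node exposes more mini-nodes to the degree-1 nodes.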


Theorem 2.19. In a PLRG graph $G$, by using Algorithm 4, MDS and MVC can be approximated within

$$1 + (\psi - 1)\lambda$$


with  probability  at  least 

Le“/2j 

II  Pr[cr2  =  t](i  -  pi) 

T— 0 

where  p\  =  ,  in  which  XT  =  A  +  ^x. 

X(c«,/3,2)  +1  Z"'“2 

Proof.  Consider  one  case  that  there  are  r  connected  components  in  the  power-law 
graph.  Thus,  according  to  Theorem  2.17,  the  probability  that  the  approximation  ratio  is 
smaller  than  i  +  (\|/-l)A  is  1  -p\  for  p\  =  where  Ar  =  A+ x.  Therefore, 

X(<*.P.2)  1-1  l^i=2 

according  to  the  law  of  total  probability,  the  theorem  follows  by  taking  into  account  all  r, 
which  ranges  from  0  up  to  [ea/2\.  □ 


For  MIS  problem,  the  approximation  ratio  can  be  obtained  as 

N  +  e“(AEf.2;y) 

P+hHyPui) 

with  probability  at  least  nl=o/2J  Pric2  =  t](1  -  pTx),  where  pTx'  =  {XT,_^02))2  in  which 

x(«./3.2)  +1 

AT/  =  A  +  Ar  1  . 

^'■=2  7? 

Numerical Analysis

Fig. 2-3 illustrates the performance of our LDP algorithms in random power-law graphs, along with the relation between different $\beta$ and the corresponding approximation ratios, from both theoretical and practical perspectives.

Figure 2-3. Numerical results of our LDP algorithms for different $\beta$ ($\alpha = 5$): (1) Theoretical results show the approximation ratios with probability at least $1 - o(1)$. As one can see, our LDP algorithms can obtain the optimal solution for all these problems once $\beta$ gets larger than 1.6 and 1.7 in ERPL and SRPL respectively, which covers the range of $\beta$ in most real-world networks [18]. For smaller exponential factors $\beta$, the approximation ratios are somewhat higher, up to 5 for the MDS and MIS problems in the SRPL model. However, the probabilities that these two problems obtain approximation ratios less than 1.5 using LDP algorithms are at least 0.95 (only slightly lower than $1 - o(1)$). (2) Experimental results further reveal that our LDP algorithms can achieve even better solutions than the theoretical bounds. (We test on 100 cases and take the average.) As illustrated in Fig. 2-3, the approximation ratios of all MDS, MVC, and MIS problems are no larger than 1.2 and 2.5 even when $\beta = 1.3$ in ERPL and SRPL models respectively.

2.8  Related  Works 

Many experimental results on random power-law graphs suggest that the problems might be much easier to solve on power-law graphs. Eubank et al. [32] showed that a simple greedy algorithm leads to a $1 + o(1)$ approximation factor for Minimum Dominating Set (MDS) and Minimum Vertex Cover (MVC) on power-law graphs (without any formal proof), although MDS and MVC have been proved NP-hard to approximate within $(1 - \epsilon)\log n$ and 1.366 on general graphs respectively [28]. In [73], Gopal also claimed that there exists a polynomial time algorithm that guarantees a $1 + o(1)$ approximation of the MVC problem with probability at least $1 - o(1)$. Unfortunately, there is no formal proof for this claim either. Furthermore, several papers give theoretical guarantees for some problems on power-law graphs. Gkantsidis et al. [36] proved that the flow through each link is at most $O(n\log^2 n)$ on power-law random graphs when routing $O(d_u d_v)$ units of flow between each pair of vertices $u$ and $v$ with degrees $d_u$ and $d_v$. In [36], the authors take advantage of the property of the power-law distribution by using the structural random model [2, 2] and show the theoretical upper bound with high probability $1 - o(1)$ and the corresponding experimental results. Likewise, Janson et al. [48] gave an algorithm that approximates Maximum Clique within $1 - o(1)$ on power-law graphs with high probability on the random Poisson model $G(n, \alpha)$ (i.e. the number of vertices with degree at least
/  decreases  roughly  as  n~').  Although  these  results  were  based  on  experiments 
and  various  random  models,  they  raise  an  interest  in  investigating  hardness  and 
inapproximability  of  optimization  problems  on  power-law  graphs. 

Recently, Ferrante et al. [35] made an initial attempt on power-law graphs, showing the NP-hardness of Maximum Clique (Clique) and Minimum Graph Coloring (Coloring) ($\beta > 1$) by constructing a bipartite graph to embed a general graph into a power-law graph, and the NP-hardness of MVC, MDS and Maximum Independent Set (MIS) ($\beta > 0$) based on their optimal substructure properties.
CHAPTER 3

VULNERABILITY ASSESSMENT

In this chapter, using the well-known random power-law graph model (SRPL) in [2], we carry out an in-depth analysis using probability theory with respect to different kinds of threats: random failures, preferential attacks and degree-centrality attacks. Our significant conclusions are: (1) A complex network can tolerate random failures if its exponential factor is less than 2.9; (2) Power-law networks are more robust under preferential node attacks and degree-centrality node attacks when they have a smaller exponential factor $\beta$; (3) In order to maintain a reliable complex system, we optimize power-law networks by investigating the optimal range of the exponential factor $\beta$ beforehand. For both communication networks and social networks, the best $\beta$ is shown to lie in the interval [1.8, 2.5], which gives a decent explanation of the structures of real-world networks [4, 12, 34, 74]. When $\beta < 1.8$, the maintenance of the network is very costly, and when $\beta > 2.5$, the network vulnerability is unpredictable due to its dependence on the specific attacking strategy; (4) When cascading failures occur, power-law networks become extremely vulnerable when the failures can be propagated more than 2 hops.

3.1  Metric 

One of the most crucial questions is which measure best captures network vulnerability. There have been many studies proposing different metrics to account for network vulnerability [3, 5, 60, 63], among which the degree of suspected nodes or edges [5], the average shortest path length [3], the global clustering coefficients [60], the available number of compromised s-t flows [63], the diameters, the relative size of the largest cluster and the average size of the isolated clusters [5] appear to be the most popular and effective. Unfortunately, these measures do not seem to fit some particular kinds of network vulnerabilities well, especially when network fragmentation is of high priority, as depicted in Figure 3-1.




Let us consider a simple example in Figure 3-1 illustrating a small portion of the Internet, where nodes $v_1, v_2, \ldots, v_7$ are ISPs and the rest are consumers or transmission nodes. As revealed in this figure, successful attacks on nodes $v_8$ and $v_{10}$ are sufficient to bring the whole network to its knees with no satisfied customers. In a different attacking strategy, the removal of node $v_7$ or $v_9$, if the adversary were to use maximum degree centrality as the metric, does not appear to harm the network function because all customers are still satisfied. These removals also reduce the global clustering coefficients to 0 and increase the average shortest path to nearly 3. Besides, if the attacker uses the available number of compromised flows from $v_1$ to $v_2$, the destruction of nodes $v_4$ and $v_7$ will drop the flow to 1, yet it still cannot destroy the giant ISP component providing services to (almost) the whole network.



Figure 3-1. An example of the Internet: the removal of $v_8$ and $v_{10}$ (grey nodes) is sufficient to destroy the function of the whole network, such that fewer than 40% of the nodes remain connected to each other.

This example illustrates an important point that the other metrics lack: in order to break down the network, we need to somehow control the balance among disconnected components while ensuring the nonexistence of giant components. One possible and effective way to do so is to measure the total pairwise connectivity ($P$), i.e. the number of connected node-pairs [27] in the network. Back to our example, a closer look at the destruction of nodes $v_8$ and $v_{10}$, which we know can break down the network function, reveals that it indeed reduces the network's total pairwise connectivity to the greatest extent (by more than 60%). This great reduction, as a result, significantly impairs the whole network. The measure $P$ also lends itself effectively to many practical network applications. As we discussed above, since many large-scale networks have been shown to be power-law networks, the removal of critical nodes and links with respect to this metric not only reduces the network performance but can also disconnect those networks from the outside world. Another application of this metric can be found in destroying terrorist networks, e.g. to break down the communication between any two terrorist individuals to the greatest extent, as well as in protecting the functionality of communication networks.

3.2  Threat  Taxonomy  and  Notations 

In  the  rest  of  this  chapter,  we  focus  on  investigating  the  vulnerability  of  power-law 
networks  under  random  failures  or  intentional  attacks.  This  section  consists  of  the 
following  parts:  (1)  threat  taxonomy,  including  random  failures  and  intentional  attacks, 
and  (2)  useful  notations. 

3.2.1  Threat  Taxonomy 

In  this  paper,  we  focus  on  investigating  the  robustness  of  power-law  networks 
under  random  failure  and  two  types  of  intentional  attacks,  i.e.  preferential  attack  and 
degree-centrality  attack. 

Definition 19 (Random Failure). Each node in $G_{(\alpha,\beta)}$ fails randomly with the same probability.

Definition 20 (Preferential Attack). Each node in $G_{(\alpha,\beta)}$ is attacked with higher probability if it has a higher degree.

Definition 21 (Degree-Centrality Attack). The adversary only attacks the set of degree-centrality nodes in $G_{(\alpha,\beta)}$.

Definition 22 (Random Cascading Failures). Each node in $G_{(\alpha,\beta)}$ fails randomly with the same probability and the failures can be cascaded.
3.2.2 Notation Explanation

With respect to each threat, we define the residual networks of the power-law network $G_{(\alpha,\beta)}$ as $G_r$, $G_p$ and $G_c$ after the occurrence of random failure, preferential attack and degree-centrality attack respectively. Their corresponding expected degree sequences are denoted as $d^r$, $d^p$ and $d^c$, where the numbers of nodes of degree $i$ in $d^r$, $d^p$ and $d^c$ are referred to as $y_i^r$, $y_i^p$ and $y_i^c$.

In addition, we define a power-law network under certain threats to be highly-connected if a.s. its pairwise connectivity $P = \Theta(n^2)$, and lowly-connected otherwise.

3.3  Preliminaries 

In  this  section,  we  first  present  some  useful  results  in  the  literature,  which  illustrate 
the  important  relations  between  the  size  of  largest  connected  components  and 
the  degree  sequence  in  random  networks.  Based  on  them,  we  then  derive  some 
fundamental results to evaluate the robustness of power-law networks. In this paper, the size of a connected component $S \subseteq G$ is the total number of nodes in $S$, and the connected component $S$ is called a giant component if its size is $\Theta(n)$.

3.3.1  Previous  Works 

Lemma 14 (M. Molloy and B. Reed [68]). In a random graph $G$ with $\lambda_i n$ nodes of degree $i$, where $\sum_i \lambda_i = 1$ and $\Delta$ is the maximum degree,

$$Q = \sum_{i=1}^{n} i(i-2)\lambda_i \tag{3-1}$$

is a metric which can be applied to determine whether $G$ has a giant component. A giant component exists if $Q > 0$ and $\Delta < n^{1/4 - \epsilon}$. Otherwise, there is a.s. no giant component if $Q < 0$ and $\Delta < n^{1/8 - \epsilon}$.

Lemma 15 (F. Chung et al. [21]). In a random graph $G$ with degree sequence $d = (d_1, d_2, \ldots, d_n)$, the giant component a.s. exists if its expected average degree $d$ is at least 1, and there is a.s. no giant component if its expected second-order average degree $\tilde{d}$ is at most 1. Furthermore, all connected components have volume (the sum of degrees in a connected component) at most $\sqrt{n} \log n$ with probability at least $1 - o(1)$ if $\tilde{d} < 1$. Here the expected average degree $d$ and second-order average degree $\tilde{d}$ are defined as

$$d = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad \tilde{d} = \frac{\sum_{i=1}^{n} d_i^2}{\sum_{i=1}^{n} d_i} \tag{3-2}$$

where the $d_i$ are the elements of the degree sequence.

Corollary 8. All connected components a.s. have sizes at most $\frac{\sqrt{n}}{2} \log n + 1$ if $\tilde{d} < 1$.

Proof. Consider a connected component $S$. The volume of $S$ is defined as $Vol(S) = \sum_{v_i \in S} d_i$. Since there are at least $|S| - 1$ edges in a connected component of size $|S|$, we have $2(|S| - 1) \le Vol(S) \le \sqrt{n} \log n$. Therefore, the size of $S$ is upper bounded by $\frac{\sqrt{n}}{2} \log n + 1$. □

3.3.2  Robustness  of  Intact  Power-law  Networks 

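Theorem 3.1. For a power-law network represented as an $(\alpha, \beta)$ graph,

• If $\beta < 3.47875$, the pairwise connectivity $P$ is $\Theta(n^2)$;

• If $\beta \ge 3.47875$, the pairwise connectivity $P$ is a.s. at most $\frac{n}{2}\left(c(\beta)\, n^{2/\beta} \log n - 1\right)$,

where $c(\beta) = 16 / \left(\zeta(\beta)^{2/\beta}\left(2 - \frac{\zeta(\beta-2)}{\zeta(\beta-1)}\right)^{2}\right)$ is a constant for any given $\beta$.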


To  prove  Theorem  3.1 ,  we  first  show  the  relation  between  the  largest  component 
and  our  metric,  the  total  pairwise  connectivity,  in  the  following  lemma. 

Lemma 16. Suppose that the maximum size of a connected component in the graph $G = (V, E)$ is $\ell$. The pairwise connectivity $P$ is then at most $\frac{n(\ell - 1)}{2}$.

Proof. To prove the upper bound, we consider the worst case, in which the whole network consists of connected components of size $\ell$ except for some leftover nodes. Suppose that there are $c_1$ connected components of size $\ell$ and the number of leftover nodes is $c_2 < \ell$, so that $n = c_1 \ell + c_2$. Therefore, the pairwise connectivity $P$ is

$$P \le c_1 \binom{\ell}{2} + \binom{c_2}{2} \le \frac{(c_1 \ell + c_2)(\ell - 1)}{2} = \frac{n(\ell - 1)}{2}.$$

□
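The bound of Lemma 16 can be checked numerically on small graphs. The following is a minimal sketch, assuming an undirected graph given as an edge list on nodes 0..n-1; the helper names `pairwise_connectivity` and `lemma16_bound` are illustrative and do not come from the original text.

```python
from collections import Counter


def pairwise_connectivity(n, edges):
    """Return (P, size of the largest component) for an undirected graph on nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)

    sizes = list(Counter(find(x) for x in range(n)).values())
    # A component of size s contributes s*(s-1)/2 connected node pairs.
    total = sum(s * (s - 1) // 2 for s in sizes)
    return total, max(sizes)


def lemma16_bound(n, largest):
    """Upper bound n*(l-1)/2 on P when every component has size at most l."""
    return n * (largest - 1) / 2


# Components {0,1,2}, {3,4} and {5}.
P, largest = pairwise_connectivity(6, [(0, 1), (1, 2), (3, 4)])
```

Here the three components contribute 3 + 1 + 0 connected pairs, which stays below the bound 6 · (3 − 1) / 2 = 6.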
Proof of Theorem 3.1:

Proof. First, according to Aiello et al. [2], we can find the threshold 3.47875 of $\beta$ such that $Q > 0$ when $\beta < 3.47875$ and $Q < 0$ when $\beta > 3.47875$.

When $\beta < 3.47875$, according to Lemma 14, since $Q > 0$, there exists one giant component of size $\Theta(n)$. Therefore, the pairwise connectivity $P$ is $\Theta(n^2)$.

When $\beta \ge 3.47875$, according to Aiello et al. [2], a connected component $S$ in the $(\alpha, \beta)$ graph a.s. has size at most $c(\beta)\, n^{2/\beta} \log n$. Then the upper bound of $P$ follows directly from Lemma 16. □

In the following three sections, since power-law networks with $\beta$ at least 3.47875 are lowly-connected even if they are not attacked, we focus on exploring the robustness of power-law networks with $\beta$ less than 3.47875 under random failures, preferential attacks and degree-centrality attacks respectively.


3.4  Random  Failures 


In this section, we focus on the robustness of power-law networks after random failures, in which each node fails with the same probability $p$ ($0 < p < 1$). The total pairwise connectivity $P$ in the residual graph $G_r$ is characterized in Theorem 3.2 below. Based on this theorem, we further investigate the good range of the exponential factor $\beta$.

3.4.1  Robustness  under  Random  Failures 

Theorem 3.2. In a residual graph $G_r$ of $G_{(\alpha,\beta)}$ after random failures,

• If $\beta < \beta_p$, the expected pairwise connectivity $E(P)$ is a.s. $\Theta(n^2)$;

• If $\beta \ge \beta_p$, the pairwise connectivity $P$ is a.s. at most $\frac{n}{2}\left(c_r(\beta)\, n^{2/\beta} \log n - 1\right)$,

where $\beta_p$ satisfies $(1 - p)\zeta(\beta_p - 2) - (2 - p)\zeta(\beta_p - 1) = 0$ and $c_r(\beta)$ is the constant fixed in the proof of this theorem.

To prove Theorem 3.2, we first show the expected degree distribution in $G_r$ as follows.

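Lemma 17. The expected degree distribution of graph $G_r$ is

$$E(y_i^r) = e^{\alpha}(1 - p)^{i+1} \sum_{k=i}^{\Delta} \binom{k}{i} \frac{p^{k-i}}{k^{\beta}}$$

where the degree $i$ satisfies $1 \le i \le \Delta$.

Proof. Let $p_k^i$ be the probability that a node $v$ of degree $k$ in $G_{(\alpha,\beta)}$ has degree $i$ in $G_r$. When $k < i$, it is clear that $p_k^i = 0$; otherwise, when $k \ge i$, $v$ becomes a node of degree $i$ in $G_r$ if and only if $v$ itself does not fail but $k - i$ of its neighbors fail. Hence, the probability $p_k^i$ is $\binom{k}{i}(1 - p)\left[p^{k-i}(1 - p)^{i}\right]$, i.e. $\binom{k}{i} p^{k-i}(1 - p)^{i+1}$.

Thus, according to the basic definition of expected value, the expected number of nodes of degree $i$ in $G_r$ is

$$E(y_i^r) = \sum_{k=1}^{\Delta} p_k^i \frac{e^{\alpha}}{k^{\beta}} = e^{\alpha}(1 - p)^{i+1} \sum_{k=i}^{\Delta} \binom{k}{i} \frac{p^{k-i}}{k^{\beta}}.$$

□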
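The formula of Lemma 17 can be sanity-checked numerically. The sketch below is illustrative only: the function names and the parameter values (alpha = 5, beta = 2.5, p = 0.3, maximum degree 30) are assumptions chosen for the check, and degree 0 is included so that the counts add up to the expected number of surviving nodes.

```python
from math import comb, exp, isclose


def survive_with_degree(k, i, p):
    """Probability that a degree-k node survives and keeps exactly i surviving neighbors."""
    return comb(k, i) * p ** (k - i) * (1 - p) ** (i + 1)


def expected_degree_counts(alpha, beta, p, max_degree):
    """Expected number of nodes of each degree i (0..max_degree) in the residual graph."""
    # y[k] is the number of degree-k nodes in the original (alpha, beta) graph.
    y = {k: exp(alpha) / k ** beta for k in range(1, max_degree + 1)}
    return {
        i: sum(survive_with_degree(k, i, p) * y[k]
               for k in range(max(i, 1), max_degree + 1))
        for i in range(max_degree + 1)
    }, y


alpha, beta, p, max_degree = 5.0, 2.5, 0.3, 30
counts, y = expected_degree_counts(alpha, beta, p, max_degree)

# Each node survives with probability 1 - p.
expected_survivors = (1 - p) * sum(y.values())
# Each edge survives with probability (1 - p)^2, so the degree sum shrinks by that factor.
expected_degree_sum = (1 - p) ** 2 * sum(k * y[k] for k in y)
```

The second identity is the one used later in the proof, where the expected number of edges in the residual graph is stated to be $(1 - p)^2 m$.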


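Proof of Theorem 3.2:

Proof. First of all, we show that Lemma 15 cannot be applied here. Considering the expected second-order average degree $\tilde{d}_r$ of $G_r$, we have

$$\tilde{d}_r = p + (1 - p)\frac{\zeta(\beta - 2)}{\zeta(\beta - 1)}.$$

It is easy to see that $\tilde{d}_r > 1$ for any $p$ and $\beta$, because $\zeta(\beta - 2) > \zeta(\beta - 1)$.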

As an alternative, we use Lemma 14 and the branching process method to prove our theorem. The basic idea is as follows: according to the expected degree distribution of $G_r$, we first find a threshold $\beta_p$ using Lemma 14, which determines whether the total pairwise connectivity $P$ of the residual network $G_r$ is a.s. $\Theta(n^2)$ or not. If not, that is, $\beta \ge \beta_p$, we further use the branching process method to prove that $P$ in $G_r$ is a.s. at most $\frac{n}{2}\left(c_r(\beta)\, n^{2/\beta} \log n - 1\right)$. First, we compute $\beta_p$ for $G_r$ as

$$Q_r = \sum_{i=1}^{\Delta} i(i - 2)(1 - p)^{i+1} \sum_{k=i}^{\Delta} \binom{k}{i} p^{k-i} \frac{e^{\alpha}}{k^{\beta}} \tag{3-3a}$$

$$= e^{\alpha}(1 - p) \sum_{k=1}^{\Delta} \frac{1}{k^{\beta}} \sum_{i=1}^{k} i(i - 2) \binom{k}{i} p^{k-i}(1 - p)^{i} \tag{3-3b}$$

$$= e^{\alpha}(1 - p) \sum_{k=1}^{\Delta} \frac{k^{2}(1 - p)^{2} - k(1 - p)(2 - p)}{k^{\beta}} \tag{3-3c}$$

$$= e^{\alpha}(1 - p)^{2} \left[(1 - p)\zeta(\beta - 2) - (2 - p)\zeta(\beta - 1)\right] \tag{3-3d}$$

where step (3-3c) follows from the expected value and variance of the binomial distribution.

Let the threshold $\beta_p$ satisfy $(1 - p)\zeta(\beta_p - 2) - (2 - p)\zeta(\beta_p - 1) = 0$. When $\beta < \beta_p$, we have $Q_r > 0$. Thus, the expected pairwise connectivity $E(P)$ is a.s. $\Theta(n^2)$ according to Lemma 14.


Algorithm 6: Branching Process Method

    i ← 0;
    E_0 = L_0 = {v} by picking an arbitrary node v;
    while |L_i| ≠ 0 do
        i ← i + 1;
        Choose an arbitrary u from L_{i-1} and expose all its neighbors N(u);
        E_i = E_{i-1} ∪ N(u);
        L_i = (L_{i-1} \ {u}) ∪ (N(u) \ E_{i-1});
    end
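The exposure process of Algorithm 6 can be sketched in a few lines. This is an illustrative implementation under the assumption that the graph is stored as a dictionary mapping each node to the set of its neighbors; the function name `explore_component` is hypothetical.

```python
def explore_component(adj, v):
    """Run the branching (exposure) process from node v.

    adj maps each node to the set of its neighbors.
    Returns (T, exposed), where T is the number of iterations.
    """
    exposed = {v}   # E_i: nodes seen so far
    live = {v}      # L_i: exposed nodes whose neighbors are not yet exposed
    iterations = 0
    while live:
        iterations += 1
        u = live.pop()              # choose an arbitrary live node and retire it
        new_nodes = adj[u] - exposed
        exposed |= adj[u]           # E_i = E_{i-1} ∪ N(u)
        live |= new_nodes           # L_i = (L_{i-1} \ {u}) ∪ (N(u) \ E_{i-1})
    return iterations, exposed


# A path 0-1-2 plus an isolated node 3.
graph = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
T, component = explore_component(graph, 0)
```

Because every iteration retires exactly one live node and every node of the component becomes live exactly once, the iteration count equals the size of the component containing the start node.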


When $\beta \ge \beta_p$, we use the following branching process method (Algorithm 6) on $G_r$ according to its expected degree sequence $E(y_i^r)$. In the algorithm, we define $E_i$ and $L_i$ as the sets of exposed nodes and live nodes in iteration $i$ respectively, where live nodes are the exposed nodes whose neighbors have not yet been exposed. Note that $|L_i| = 0$ if and only if the entire component is exposed. For simplicity, we define the random variables $\mathcal{E}_i = |E_i|$ and $\mathcal{L}_i = |L_i|$ as the numbers of exposed nodes and live nodes. Let $T$ denote the total number of iterations of the branching process; $T$ also measures the size of the connected component, since exactly one node is removed from the live set in each iteration. We further define an edge to be a “backedge” if it connects $u$ and some node in $E_{i-1}$. We denote by $D_i = |N(u)|$ and $B_i = |N(u) \cap E_{i-1}| - 1$ the degree of the node exposed in iteration $i$ and the number of “backedges”. By definition, we immediately have $\mathcal{L}_i - \mathcal{L}_{i-1} = D_i - B_i - 2$.

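Then, we calculate $E(D_i)$, $E(B_i)$ and $E(\mathcal{L}_i)$ respectively. Consider one edge in the original graph $G_{(\alpha,\beta)}$. It still exists iff neither endpoint has failed, that is, the expected number of edges in $G_r$ is $(1 - p)^2 m$, and the expected sum of degrees in $G_r$ is $\sum_{j} j\,E(y_j^r) = 2(1 - p)^2 m$. Therefore,

$$E(D_i) = \sum_{i=1}^{\Delta} i \cdot \frac{i\,E(y_i^r)}{\sum_{j} j\,E(y_j^r)} = \frac{1}{\zeta(\beta - 1)} \sum_{i=1}^{\Delta} \frac{i^{2}(1 - p) + ip}{i^{\beta}} = (1 - p)\frac{\zeta(\beta - 2)}{\zeta(\beta - 1)} + p.$$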

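Since $|N(u) \cap E_{i-1}| \ge 1$, we have $E(B_i) \ge 0$. By substituting $E(D_i)$ and $E(B_i)$ into $\mathcal{L}_i - \mathcal{L}_{i-1} = D_i - B_i - 2$, we have

$$E(\mathcal{L}_i) = \mathcal{L}_1 + \sum_{j=2}^{i} E(\mathcal{L}_j - \mathcal{L}_{j-1})$$

$$= d_0 + \sum_{j=2}^{i} E(D_j - B_j - 2)$$

$$\le d_0 + (i - 1)\left((1 - p)\frac{\zeta(\beta - 2)}{\zeta(\beta - 1)} + p - 2\right)$$

$$= d_0 - \lambda(p, \beta)(i - 1)$$

where $\lambda(p, \beta) = 2 - p - (1 - p)\frac{\zeta(\beta - 2)}{\zeta(\beta - 1)}$ and the initial node is assumed to have degree $d_0$. Since $|\mathcal{L}_j - \mathcal{L}_{j-1}| \le \Delta = e^{\alpha/\beta}$, according to Azuma's martingale inequality [22],

$$\Pr\left[|\mathcal{L}_i - E(\mathcal{L}_i)| > \tau\right] \le 2\exp\left(\frac{-\tau^{2}}{2 i\, e^{2\alpha/\beta}}\right)$$

where $i = \frac{16}{\lambda(p, \beta)^{2}}\, e^{2\alpha/\beta} \log n = c_r(\beta)\, n^{2/\beta} \log n$ and $\tau = \lambda(p, \beta)\, i / 2$. We know that

$$E(\mathcal{L}_i) + \tau \le d_0 - \lambda(p, \beta)(i - 1) + \frac{\lambda(p, \beta)\, i}{2} < 0$$

for any $d_0$. Therefore,

$$\Pr[T > i] \le \Pr[\mathcal{L}_i > 0] \le \Pr[\mathcal{L}_i > E(\mathcal{L}_i) + \tau] \le \frac{2}{n^{2}}.$$

Thus, the probability that, in graph $G_r$, there is a non-failure node $v$ belonging to a connected component of size larger than $c_r(\beta)\, n^{2/\beta} \log n$ is at most $n \cdot \frac{2}{n^{2}} = o(1)$, i.e. the largest connected component of $G_r$ a.s. has size at most $c_r(\beta)\, n^{2/\beta} \log n$. Hence, the upper bound on the pairwise connectivity in $G_r$ follows from Lemma 16 directly.

□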
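As a worked check of the constants in the concentration step, take the proof's choices $i = \frac{16}{\lambda^{2}} e^{2\alpha/\beta} \log n$ and $\tau = \lambda i / 2$, with $\lambda = \lambda(p, \beta)$ and step bound $\Delta = e^{\alpha/\beta}$. This is only a sketch of the arithmetic under those assumptions; the exponent in Azuma's bound then simplifies as follows.

```latex
\[
\frac{\tau^{2}}{2 i \Delta^{2}}
  = \frac{\lambda^{2} i^{2} / 4}{2 i\, e^{2\alpha/\beta}}
  = \frac{\lambda^{2}}{8\, e^{2\alpha/\beta}} \cdot i
  = \frac{\lambda^{2}}{8\, e^{2\alpha/\beta}} \cdot \frac{16}{\lambda^{2}}\, e^{2\alpha/\beta} \log n
  = 2 \log n ,
\]
\[
2 \exp\!\left(-\frac{\tau^{2}}{2 i \Delta^{2}}\right) = 2 e^{-2 \log n} = \frac{2}{n^{2}},
\qquad
n \cdot \frac{2}{n^{2}} = \frac{2}{n} = o(1).
\]
```

The last step is a union bound over the at most $n$ possible starting nodes.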


3.4.2 Good Range of β under Random Failures

According to Theorem 3.2, we explore the good range of the exponential factor $\beta$ in terms of the pairwise connectivity $P$ of power-law networks. By obtaining the threshold $\beta_p$ from $(1 - p)\zeta(\beta_p - 2) - (2 - p)\zeta(\beta_p - 1) = 0$, the relation between the threshold $\beta_p$ and the failure probability $p$ is shown in Fig. 3-2.
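The threshold equation has no closed form, but it is easy to solve numerically. The following is a minimal sketch, assuming a bisection search on the interval (3, 4] and a small Euler-Maclaurin approximation of the Riemann zeta function; the helpers `zeta` and `threshold_beta` are illustrative names, not part of the original analysis.

```python
def zeta(s, terms=50):
    """Approximate the Riemann zeta function for real s > 1.

    Direct sum of the first terms plus Euler-Maclaurin tail corrections.
    """
    head = sum(k ** -s for k in range(1, terms))
    n = float(terms)
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def threshold_beta(p, lo=3.0 + 1e-6, hi=4.0, iterations=60):
    """Solve (1-p)*zeta(b-2) - (2-p)*zeta(b-1) = 0 for b by bisection."""
    def f(beta):
        return (1 - p) * zeta(beta - 2) - (2 - p) * zeta(beta - 1)

    # f is positive just above 3 (zeta(b-2) blows up) and negative at 4.
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


beta_no_failure = threshold_beta(0.0)
beta_half = threshold_beta(0.5)
beta_heavy = threshold_beta(0.8)
```

With $p = 0$ the equation reduces to $\zeta(\beta - 2) = 2\zeta(\beta - 1)$, so the solver should recover the threshold 3.47875 used for intact networks, and the threshold decreases as $p$ grows.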


Figure 3-2. Relation between Threshold $\beta_p$ and Failure Probability $p$ (horizontal axis: random failure probability $p$, from 0 to 0.8)
Based on Theorem 3.2, power-law networks are highly-connected when $\beta < \beta_p$ and lowly-connected otherwise. As one can see from Fig. 3-2, power-law networks with exponential factor $\beta < 2.9$ still remain highly-connected under random failures even when the failure probability $p$ is as unrealistically high as 0.8. That is, we can confidently claim that a power-law network has an extremely high tolerance to random failures when its exponential factor $\beta < 2.9$.

3.5  Preferential  Attacks 

As power-law networks tolerate random failures well, one may ask whether they can still tolerate intentional attacks in which the intruders preferentially target “hub” nodes. In this section, we focus on the robustness of power-law networks under preferential attacks. As defined above, in preferential attacks, each node in the network is attacked with higher probability if it has larger degree. Therefore, considering the attack costs for the intruders, we focus on the following two preferential attack schemes:

• Interactive Preferential Attacks: one way to control the attack cost is to attack a node according to its degree and a new parameter $\beta'$. That is, a node of degree $i$ is attacked with probability $1 - \frac{1}{i^{\beta'}}$;

• Expected Preferential Attacks: another way to control the attack cost is based on the expected number of nodes $c$ to attack. When the intruder chooses $c$, ranging between 0 and $e^{\alpha}\zeta(\beta)$, a node of degree $i$ is attacked with probability $p_i = \frac{c\, i}{e^{\alpha}\zeta(\beta - 1)}$, since the expected number of failed nodes is then equal to $c$, namely $\sum_i \frac{e^{\alpha}}{i^{\beta}}\, p_i = c$.

As one can see, in both schemes, a node of higher degree, often referred to as a “hub”, is attacked preferentially, that is, it has a higher probability of being attacked. Denoting the corresponding residual graphs as $G_p^I$ and $G_p^E$, their total pairwise connectivity is characterized in Theorem 3.3 and Theorem 3.4 respectively.

3.5.1 Interactive Preferential Attacks ($p_i = 1 - \frac{1}{i^{\beta'}}$)

Theorem 3.3. In a residual graph $G_p^I$ of $G_{(\alpha,\beta)}$ after interactive preferential attacks,

• If $\beta + \beta' < 3.47875$, the expected pairwise connectivity $E(P)$ is $\Theta(n^2)$;

• If $\beta + \beta' \ge 3.47875$, the pairwise connectivity $P$ is a.s. at most $\frac{n}{2}\left(c(\beta)\, n^{2/\beta} \log n - 1\right)$,

where $c(\beta) = 16 / \left(\zeta(\beta)^{2/\beta}\left(2 - \frac{\zeta(\beta-2)}{\zeta(\beta-1)}\right)^{2}\right)$ is a constant for any given $\beta$.

Theorem 3.3 can be proven in the same way as Theorem 3.1 after showing the expected degree distribution of the residual graph $G_p^I$ as in Lemma 20, which is based on the following two lemmas.

Lemma 18. In graph $G_{(\alpha,\beta)}$, the probability that a node $v$ of degree $i$ is incident to another node $u$ of degree $x$ is $\frac{ix}{e^{\alpha}\zeta(\beta - 1)}$.

Proof. Consider a node $v$ of degree $i$. In the matching of mini-nodes, at least one of the $i$ mini-nodes of $v$ connects to one of the $x$ mini-nodes of node $u$ of degree $x$. Thus, we have

$$\frac{\binom{i}{1}\binom{x}{1} f(N - 2)}{f(N)} = \frac{ix}{N - 1} \approx \frac{ix}{N} = \frac{ix}{e^{\alpha}\zeta(\beta - 1)}$$

where $f(n) = (n - 1)!!$ represents the number of perfect matchings on $n$ nodes and $N = e^{\alpha}\zeta(\beta - 1)$ denotes the number of mini-nodes. □


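Lemma 19. For a node $v$ of degree $i$, the expected number of non-failure neighbors $E(N_p^I(i))$ of $v$ is $i\,\frac{\zeta(\beta + \beta' - 1)}{\zeta(\beta - 1)}$.

Proof. According to Lemma 18, node $v$ has probability $\frac{ix}{e^{\alpha}\zeta(\beta - 1)}$ to connect to a node $u$ of degree $x$. Since a node $u$ of degree $x$ has non-failure probability $\frac{1}{x^{\beta'}}$, the expected number of non-failure neighbors of $v$ is

$$E(N_p^I(i)) = \sum_{x=1}^{\Delta} \frac{ix}{e^{\alpha}\zeta(\beta - 1)} \cdot \frac{1}{x^{\beta'}} \cdot \frac{e^{\alpha}}{x^{\beta}} = i\,\frac{\zeta(\beta + \beta' - 1)}{\zeta(\beta - 1)}.$$

The proof is complete. □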

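Lemma 20. The expected degree distribution of graph $G_p^I$ is

$$E(y_i^p) = \frac{e^{\alpha}}{i^{\beta + \beta'}} \left(\frac{\zeta(\beta + \beta' - 1)}{\zeta(\beta - 1)}\right)^{\beta + \beta'}$$

where $i \in \left\{\frac{\zeta(\beta + \beta' - 1)}{\zeta(\beta - 1)}, \ldots, \Delta\,\frac{\zeta(\beta + \beta' - 1)}{\zeta(\beta - 1)}\right\}$.

Proof. Consider the set of nodes with degree $i$ in $G_p^I$; they correspond to the nodes of degree $x$ in the original graph $G_{(\alpha,\beta)}$. Hence, the expected number of unattacked nodes in this set is $\frac{e^{\alpha}}{x^{\beta}} \cdot \frac{1}{x^{\beta'}} = \frac{e^{\alpha}}{x^{\beta + \beta'}}$. According to Lemma 19, the relation between $i$ and $x$ is $i = x\,\frac{\zeta(\beta + \beta' - 1)}{\zeta(\beta - 1)}$. Therefore, the expected number of nodes of degree $i$ in $G_p^I$ is

$$\frac{e^{\alpha}}{i^{\beta + \beta'}} \left(\frac{\zeta(\beta + \beta' - 1)}{\zeta(\beta - 1)}\right)^{\beta + \beta'}.$$

□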


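3.5.2 Expected Preferential Attacks ($p_i = \frac{c\, i}{e^{\alpha}\zeta(\beta - 1)}$)

Theorem 3.4. In a residual graph $G_p^E$ of $G_{(\alpha,\beta)}$ after expected preferential attacks,

• The pairwise connectivity $P$ is a.s. $\Theta(n^2)$ if $c < \min\left\{c \mid d_p^E > 1\right\}$;

• The pairwise connectivity $P$ is a.s. at most $\frac{1}{4} n^{3/2} \log n$ if $c > \max\left\{c \mid \tilde{d}_p^E < 1\right\}$,

where $d_p^E$ and $\tilde{d}_p^E$ are the expected average degree and the expected second-order average degree of $G_p^E$, computed as functions of $c$ in the proof of this theorem.

To prove Theorem 3.4, we again first show the expected degree distribution in $G_p^E$ as follows.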

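Lemma 21. For a node $v$ of degree $i$, the expected number of non-failure neighbors $E(N_p^E(i))$ of $v$ is $i\left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right)$.

Proof. According to Lemma 18, node $v$ has probability $\frac{ix}{e^{\alpha}\zeta(\beta - 1)}$ to connect to a node $u$ of degree $x$. Since a node $u$ of degree $x$ has non-failure probability $1 - \frac{cx}{e^{\alpha}\zeta(\beta - 1)}$, the expected number of non-failure neighbors of $v$ is

$$E(N_p^E(i)) = \sum_{x=1}^{\Delta} \frac{ix}{e^{\alpha}\zeta(\beta - 1)} \left(1 - \frac{cx}{e^{\alpha}\zeta(\beta - 1)}\right) \frac{e^{\alpha}}{x^{\beta}} = i\left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right).$$

The proof is complete. □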


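Corollary 9. The expected degree distribution of graph $G_p^E$ is

$$E(y_i^E) = \frac{e^{\alpha}}{i^{\beta}} \left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right)^{\beta} \left(1 - \frac{c\, i}{e^{\alpha}\zeta(\beta - 1)\left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right)}\right)$$

where $i \in \left\{1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}, \ldots, \Delta\left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right)\right\}$.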


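Proof of Theorem 3.4:

Proof. In the proof, we first calculate the expected average degree $d_p^E$ based on Corollary 9 as

$$d_p^E = \frac{\sum_{x=1}^{\Delta} \frac{e^{\alpha}}{x^{\beta}} \left(1 - \frac{cx}{e^{\alpha}\zeta(\beta - 1)}\right) x \left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right)}{n - c}$$

and the second-order average degree $\tilde{d}_p^E$ as

$$\tilde{d}_p^E = \frac{\sum_{x=1}^{\Delta} \frac{e^{\alpha}}{x^{\beta}} \left(1 - \frac{cx}{e^{\alpha}\zeta(\beta - 1)}\right) \left[x \left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right)\right]^{2}}{\sum_{x=1}^{\Delta} \frac{e^{\alpha}}{x^{\beta}} \left(1 - \frac{cx}{e^{\alpha}\zeta(\beta - 1)}\right) x \left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right)} = \left(1 - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)^{2}}\right) \frac{\zeta(\beta - 2) - \frac{c\,\zeta(\beta - 3)}{e^{\alpha}\zeta(\beta - 1)}}{\zeta(\beta - 1) - \frac{c\,\zeta(\beta - 2)}{e^{\alpha}\zeta(\beta - 1)}}.$$

According to Lemma 15 and Corollary 8, there exists one giant component if $d_p^E > 1$, and all components have size at most $\frac{\sqrt{n}}{2} \log n + 1$ if $\tilde{d}_p^E < 1$; the proof then follows from Lemma 16 directly. □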


3.5.3 Relations between β and Expected Attacked Nodes

In interactive preferential attacks, according to Theorem 3.3, a power-law network with exponential factor $\beta$ becomes lowly-connected if the intruder selects a $\beta'$ such that $\beta + \beta' \ge 3.47875$. Since a node of degree $i$ is attacked with probability $1 - \frac{1}{i^{\beta'}}$ in this scheme, it survives with probability $\frac{1}{i^{\beta'}}$. Therefore, the expected number of surviving nodes is

$$\sum_{i} \frac{e^{\alpha}}{i^{\beta}} \cdot \frac{1}{i^{\beta'}} = e^{\alpha}\zeta(\beta + \beta'),$$

that is, the expected percentage of attacked nodes can be obtained by calculating

$$1 - \frac{\zeta(\beta + \beta')}{\zeta(\beta)}.$$
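The attacked fraction is straightforward to evaluate. The sketch below assumes the intruder picks the smallest $\beta'$ that makes the network lowly-connected, i.e. $\beta' = 3.47875 - \beta$; the helper names and the Euler-Maclaurin approximation of the zeta function are illustrative choices, not part of the original text.

```python
BETA_THRESHOLD = 3.47875  # threshold on beta + beta' from Theorem 3.3


def zeta(s, terms=50):
    """Approximate the Riemann zeta function for real s > 1."""
    head = sum(k ** -s for k in range(1, terms))
    n = float(terms)
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def attacked_fraction(beta, beta_prime):
    """Expected fraction of attacked nodes: 1 - zeta(beta + beta') / zeta(beta)."""
    return 1 - zeta(beta + beta_prime) / zeta(beta)


def min_attacked_fraction(beta):
    """Fraction attacked when beta' is just large enough to reach the threshold."""
    return attacked_fraction(beta, BETA_THRESHOLD - beta)


fraction_small_beta = min_attacked_fraction(2.5)
fraction_large_beta = min_attacked_fraction(3.0)
```

Under this assumption a network with smaller $\beta$ requires a larger share of its nodes to be attacked, which matches the observation that smaller $\beta$ gives a more robust network.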

Fig. 3-3 reports the relation between $\beta$ and expected attacked nodes under iterative preferential attacks. We observe that the expected number of attacked nodes decreases sharply as $\beta$ increases. Clearly, a smaller $\beta$ leads to a more robust power-law network.

Figure 3-3. Relation between $\beta$ and Attacked Nodes under Iterative Preferential Attacks

Figure 3-4. Relation between $\beta$ and Attacked Nodes under Expected Preferential Attacks

In expected preferential attacks, Fig. 3-4 again reveals that the smaller $\beta$, the better. According to Theorem 3.4, except for the uncertain areas (shaded areas), we can see that the percentage of attacked nodes (under the red line) decreases as $\beta$ increases.

3.6  Degree-Centrality  Attacks 

As power-law networks are quite vulnerable under preferential attacks, their tolerance to deterministic intentional attacks deserves further attention. One can also ask whether it is still true under deterministic intentional attacks that power-law networks with smaller $\beta$ better maintain their functionality. In this section, we consider the degree-centrality attack, in which the intruders intentionally attack the “hubs”, the set of nodes of highest degree. When all nodes of degree larger than $x_0$ are attacked simultaneously, we have the following Theorem 3.5.

3.6.1  Robustness  under  Degree-Centrality  Attacks 

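Theorem 3.5. In a residual graph $G_c$ of $G_{(\alpha,\beta)}$ after degree-centrality attacks,

• The pairwise connectivity $P$ is a.s. $\Theta(n^2)$ if $x_0 > \min\left\{x_0 \mid d_c > 1\right\}$;

• The pairwise connectivity $P$ is a.s. at most $\frac{1}{4} n^{3/2} \log n$ if $x_0 < \max\left\{x_0 \,\Big|\, \frac{1}{\zeta(\beta - 1)} \sum_{x=1}^{x_0} \frac{1}{x^{\beta - 2}} < 1\right\}$,

where $d_c$ is the expected average degree of $G_c$ computed in the proof of this theorem.

To prove Theorem 3.5, we again first show the expected degree distribution in $G_c$ as follows.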

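Lemma 22. For a node $v$ of degree $i$ in the original graph $G_{(\alpha,\beta)}$, the expected number of neighbors of degree larger than $x_0$ is $\frac{i}{\zeta(\beta - 1)} \sum_{x=x_0+1}^{\Delta} \frac{1}{x^{\beta - 1}}$.

Proof. According to Lemma 18, the probability that a node $v$ of degree $i$ is incident to a node $u$ of degree $x$ is $\frac{ix}{e^{\alpha}\zeta(\beta - 1)}$. Therefore, the expected number of neighbors of degree larger than $x_0$ is

$$\sum_{x=x_0+1}^{\Delta} \frac{ix}{e^{\alpha}\zeta(\beta - 1)} \cdot \frac{e^{\alpha}}{x^{\beta}} = \frac{i}{\zeta(\beta - 1)} \sum_{x=x_0+1}^{\Delta} \frac{1}{x^{\beta - 1}}.$$

□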


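Corollary 10. The expected degree sequence in $G_c$ is

$$E(y_i^c) = \frac{e^{\alpha}}{i^{\beta}} \left(\frac{1}{\zeta(\beta - 1)} \sum_{x=1}^{x_0} \frac{1}{x^{\beta - 1}}\right)^{\beta}$$

where $i \in \left\{\frac{1}{\zeta(\beta - 1)} \sum_{x=1}^{x_0} \frac{1}{x^{\beta - 1}}, \ldots, \frac{x_0}{\zeta(\beta - 1)} \sum_{x=1}^{x_0} \frac{1}{x^{\beta - 1}}\right\}$.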
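Proof of Theorem 3.5:

Proof. With the expected average degree $d_c$ as

$$d_c = \frac{\sum_{x=1}^{x_0} \frac{e^{\alpha}}{x^{\beta}} \cdot x \left(\frac{1}{\zeta(\beta - 1)} \sum_{i=1}^{x_0} \frac{1}{i^{\beta - 1}}\right)}{\sum_{i=1}^{\Delta} \frac{e^{\alpha}}{i^{\beta}} - \sum_{i=x_0+1}^{\Delta} \frac{e^{\alpha}}{i^{\beta}}}$$

and the second-order average degree $\tilde{d}_c$ as

$$\tilde{d}_c = \frac{\sum_{x=1}^{x_0} \frac{e^{\alpha}}{x^{\beta}} \left[x \left(\frac{1}{\zeta(\beta - 1)} \sum_{i=1}^{x_0} \frac{1}{i^{\beta - 1}}\right)\right]^{2}}{\sum_{x=1}^{x_0} \frac{e^{\alpha}}{x^{\beta}} \cdot x \left(\frac{1}{\zeta(\beta - 1)} \sum_{i=1}^{x_0} \frac{1}{i^{\beta - 1}}\right)} = \frac{1}{\zeta(\beta - 1)} \sum_{x=1}^{x_0} \frac{1}{x^{\beta - 2}},$$

the rest of the proof is the same as that of Theorem 3.4.

□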
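The fragmentation condition on the cutoff $x_0$ can be evaluated directly. The following is an illustrative sketch, assuming the condition reads $\frac{1}{\zeta(\beta - 1)} \sum_{x=1}^{x_0} x^{2 - \beta} < 1$ and that $\beta > 2$ so that $\zeta(\beta - 1)$ is finite; the function names and the search cap `x_max` are hypothetical.

```python
def zeta(s, terms=50):
    """Approximate the Riemann zeta function for real s > 1."""
    head = sum(k ** -s for k in range(1, terms))
    n = float(terms)
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def max_fragmenting_cutoff(beta, x_max=10**6):
    """Largest x0 whose second-order average degree stays below 1.

    Returns 0 if no positive cutoff satisfies the condition.
    """
    normaliser = zeta(beta - 1)
    partial_sum = 0.0
    best = 0
    for x in range(1, x_max + 1):
        partial_sum += x ** (2 - beta)
        if partial_sum / normaliser >= 1:
            break
        best = x
    return best


cutoff = max_fragmenting_cutoff(2.5)
```

For $\beta = 2.5$ the partial sums $1 + 2^{-1/2} + 3^{-1/2} \approx 2.284$ stay below $\zeta(1.5) \approx 2.612$, while adding $4^{-1/2}$ exceeds it, so in this sketch the attacker must remove every node of degree larger than 3 to guarantee fragmentation.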


3.6.2 Relations between β and Attacked Nodes

Fig. 3-5 illustrates the relation between $\beta$ and attacked nodes under degree-centrality attacks based on Theorem 3.5. On the one hand, as with expected preferential attacks, the percentage of attacked nodes (under the red line) decreases as $\beta$ increases, except for the uncertain areas (shaded areas). On the other hand, under degree-centrality attacks, the intruder needs to attack roughly 8% fewer nodes to reduce the pairwise connectivity of power-law networks than under expected preferential attacks.

Figure 3-5. Relation between $\beta$ and Attacked Nodes under Degree-Centrality Attacks
3.7 Random Cascading Failures

More importantly, failures lead to much more devastating consequences when they cascade, i.e., when failed nodes cause the overload and failure of their nearby elements in the system because of load shifting.

Let us consider an example in Figure 3-6 illustrating a small portion of the power grid, where nodes v1, v2, ..., v7 are generators, v8, v9, ..., v16 are transmitters and v17, v18, v19 are customers. Each node has load equal to its degree and capacity equal to twice its degree. As revealed in this figure, successful attacks on nodes v8 and v10 immediately affect the power supply from generators or transmitters, while customers are still able to use electricity until the electricity left in the demand centers is used up. However, when failures cascade, all transmitters can fail sequentially (gradual color changes), i.e., v8, v10 ⇒ v12 ⇒ v9, v15 ⇒ v11, v13 ⇒ v14, v16, leading to an immediate loss of power supply to customers. Therefore, in order to continuously maintain normal network functions, it is of great importance to assess the network vulnerability in the presence of cascading failures beforehand.

In this section, taking cascading failures into account, we analyze the network vulnerability via two main thrusts, complex network structure analysis and optimal detection of the most vulnerable nodes, based on the recently proposed effective metric, total pairwise connectivity [27, 77, 78]. By measuring the connected node-pairs in residual networks (a pair of nodes is connected when there is a functional path between them), minimizing the total pairwise connectivity controls the balance among disconnected components and further ensures the nonexistence of giant components, leading to the destruction of network functionality.

3.7.1  Cascading  Failure  Model 

In this paper, we use one of the most well-accepted models, proposed in [85], in which each node $u$ in the network has a threshold $\theta_u \in [0, 1]$, typically drawn from some probability distribution. Starting with an initial set of failed nodes $F_0$, called vulnerable nodes, the dynamics of failure cascades unfold round by round as follows. The cascading process proceeds deterministically in discrete rounds: in round $t$, all nodes that failed in round $t - 1$ remain failed, and another node $u$ fails if the fraction of its failed neighbors is at least $\theta_u$, i.e., $|N(u) \cap F_{t-1}| \ge \theta_u \deg(u)$, in which $F_{t-1}$ is the set of nodes that have failed by the end of round $t - 1$.

Figure 3-6. Each node in this power grid has load equal to its degree and capacity equal to twice its degree, and each red arrow represents a shift of 2 units of load. The solid red arrows stand for direct failures caused by the cascade, and the dotted ones indicate load shifting to a neighbor that does not fail directly. The overload and failure of v8 and v10 alone only cause disconnection from generators and transmitters, and power can still be supplied to customers from demand centers. However, when the failure cascades, it leads to the breakdown of all transmitters, and the electricity supply to customers is affected instantly.

In addition, this model is also one of the two main cascading models in the social science literature, where it is referred to as the Linear Threshold Propagation model. More importantly, this model covers most contagion problems, such as models of failures in engineering systems, e.g., the power grid [75] and the Internet [5], or models of epidemics [53], and so on.

In the literature, some works assume that the thresholds $\theta_u$ are given as part of the input. However, the thresholds are usually constant in communication networks due to the fixed load of each node. On the other hand, they are generally not available and non-trivial to infer in social networks [39]. Therefore, instead of random thresholds, we use a simplified variation in which a node fails if a fraction $\theta$ of its neighbors failed in the previous round.
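The simplified uniform-threshold cascade can be simulated directly. The sketch below is illustrative, assuming the graph is a dictionary mapping each node to the set of its neighbors and $0 < \theta \le 1$; the function name `cascade` and the star-graph example are not from the original text.

```python
def cascade(adj, initial_failures, theta):
    """Simulate the uniform-threshold cascade in synchronous rounds.

    A surviving node fails in a round if at least a fraction theta of its
    neighbors had failed by the end of the previous round.
    Returns (set of failed nodes, number of rounds in which new failures occurred).
    """
    failed = set(initial_failures)
    rounds = 0
    while True:
        newly_failed = {
            u for u, neighbors in adj.items()
            if u not in failed
            and neighbors
            and len(neighbors & failed) >= theta * len(neighbors)
        }
        if not newly_failed:
            return failed, rounds
        failed |= newly_failed
        rounds += 1


# A star: hub 0 connected to leaves 1, 2 and 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
failed_low, rounds_low = cascade(star, {1, 2}, theta=0.5)
failed_high, rounds_high = cascade(star, {1, 2}, theta=0.9)
```

With $\theta = 0.5$ the hub fails in the first round (two of its three neighbors are down) and the last leaf follows in the second; with $\theta = 0.9$ the failure stays confined to the two initial nodes.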

3.7.2  Cascading  Random  Failures 

In order to analyze the pairwise connectivity in the residual power-law graph, we first provide the following theorem by extending Theorem 3.2:

Theorem 3.6. Consider the residual graph $G'$ with expected degree sequence $d = \{d_1, d_2, \ldots, d_n\}$ ($y = \{y_1, y_2, \ldots, y_{\Delta'}\}$, where $y_i$ represents the number of elements with value $i$ in $d$) and maximum degree $\Delta'$. Suppose the following conditions hold w.r.t. the bounds of the first-order degree summation

$$d^{1}_{\min}(G') \le \sum_{i=1}^{n} d_i \le d^{1}_{\max}(G') \tag{3-4}$$

and the bounds of the second-order degree summation

$$d^{2}_{\min}(G') \le \sum_{i=1}^{n} d_i^{2} = \sum_{j=1}^{\Delta'} j^{2} y_j \le d^{2}_{\max}(G'). \tag{3-5}$$

Then the pairwise connectivity is a.s. at most $\frac{n}{2}\left(c\,\Delta'^{2} \log n - 1\right)$, where $c = 16 / \left(2 - \frac{d^{2}_{\max}(G')}{d^{1}_{\min}(G')}\right)^{2}$. Note that the bounds $d^{1}_{\min}(G')$ and $d^{2}_{\max}(G')$ are the more important ones for assessing the pairwise connectivity when the residual network is fragmented.


Proof. Here we only show the parts that differ from the proof of Theorem 3.2. After the branching process method, we again focus on calculating $E(D_i)$, $E(B_i)$ and $E(\mathcal{L}_i)$ respectively. Note that we only focus on the steps that differ from [78] in this proof. Let $\lambda = 2 - \frac{d^{2}_{\max}(G')}{d^{1}_{\min}(G')}$. We have

$$E(D_i) \le \frac{d^{2}_{\max}(G')}{d^{1}_{\min}(G')} = 2 - \lambda.$$

Similarly to [78], we have $E(B_i) \ge 0$ due to $|N(u) \cap E_{i-1}| \ge 1$. Then

$$E(\mathcal{L}_i) \le d_0 - \lambda(i - 1)$$

is derived from the substitution of $E(D_i)$ and $E(B_i)$ into $\mathcal{L}_i - \mathcal{L}_{i-1} = D_i - B_i - 2$, where the initial node is assumed to have degree $d_0$.

Since the maximum degree in the residual graph $G'$ is $\Delta'$, we have $|\mathcal{L}_j - \mathcal{L}_{j-1}| \le \Delta'$. According to Azuma's martingale inequality [22],

$$\Pr\left[|\mathcal{L}_i - E(\mathcal{L}_i)| > \Omega\right] \le 2\exp\left(\frac{-\Omega^{2}}{2 i\, \Delta'^{2}}\right)$$

where $i = \frac{16}{\lambda^{2}} \Delta'^{2} \log n = c\,\Delta'^{2} \log n$ and $\Omega = \frac{\lambda}{2} i$. Since $E(\mathcal{L}_i) + \Omega \le d_0 - \lambda(i - 1) + \frac{\lambda}{2} i < 0$ for any $d_0$, we have

$$\Pr\left[T > \frac{16}{\lambda^{2}} \Delta'^{2} \log n\right] \le \frac{2}{n^{2}}.$$

Then, the probability that there is a non-failure node belonging to a connected component of size larger than $c\,\Delta'^{2} \log n$ in graph $G'$ is at most $o(1)$, and the upper bound on the pairwise connectivity in $G'$ follows from Lemma 16 directly. □

In  this  subsection,  we  focus  on  investigating  the  expected  degree  sequence  of  the 
residual  graph,  along  with  its  upper  and  lower  bounds  of  first  and  second  order  degree 
summation. 

Lemma 23. When $p > \theta$, the upper bound $\max\{\Pr^k_d\}$ of the probability $\Pr^k_d$ that a node $v$ of degree $k$ survives after $d > 0$ rounds of cascades can be recursively computed using

$$(1 - p)\left(\frac{1}{2}\exp\left\{-2k\Big(1 - \theta - \sum_{i=1}^{\Delta} \phi_i \cdot \max\{\Pr^i_{d-1}\}\Big)^2\right\}\right)$$

where $\phi_i = \frac{1}{i^{\beta-1}\zeta(\beta-1)}$ is the probability that a neighbor of an arbitrary node has degree $i$ [78], and $\Pr^i_0 = 1 - p$ for any degree $i$ since each node randomly fails with the same probability $p$ at the beginning.

Proof. Consider a node $v$ of degree $k$ and the probability $\phi_i$ that $v$ has a neighbor of degree $i$. We have

/d 

Pr[v  has  x,  neighbors  of  degree  /]  =  ^ — - 

For a neighbor of $v$ with degree $i$, it could either fail in round $j$ with probability $p_{ij}$ or survive through round $d - 1$ with probability $q_{i(d-1)}$ $(= \Pr^i_{d-1})$, that is, $\sum_{j=0}^{d-1} p_{ij} + q_{i(d-1)} = 1$. Let $f_{ij}$ be the number of neighbors of degree $i$ that failed in round $j$ and $s_{i(d-1)}$ be the number of neighbors of degree $i$ that survived through round $d - 1$. Note that the probabilities $p_{ij}$ and $q_{i(d-1)}$ can be derived from the power-law random network model based only on the degree $i$ of a node in a particular round $j$, along with the initial failure probability $p$ of each node. Therefore, we have


n*’ 


Pr[v  survives  at  round  0  n 

fy  neighbors  of  degree  /  fail  in  round  j  n 

s/(d_ i)  neighbors  of  degree  /  survive  after  round  d  -  1] 

Pr[v  survives  at  round  0  n 

x,  neighbors  of  degree  / 

fj  neighbors  out  of  x,  of  degree  /  fail  in  round  j  n 
s/(d_ i)  neighbors  out  of  x,  of  degree  /  survive  after 
round  d  -  1  |  v  has  x,  neighbors  of  degree  /] 

•Pr[v  has  Xj  neighbors  of  degree  /] 


(i-p)II 


X,! 


ii  vrrd_1  f  I c  J-- 
/=!  llj=l  'y!S/(c/-l)  ;=l 


n^-fTr^rn 

j'=0  '  /=! 


0,X' 


(1  —  P)k\ 


&  d— 1 


n,  a  n,  n**"’8*'’' 


]i(d- 1) 


/=! 


where the third step holds since the probability is equal to 0 when there exists some $x_i \ne \sum_{j=0}^{d-1} f_{ij} + s_{i(d-1)}$. Also, it is clear that $\sum_{i=1}^{\Delta}\sum_{j=0}^{d-1} \phi_i p_{ij} + \sum_{i=1}^{\Delta} \phi_i q_{i(d-1)} = 1$. According to the cascading model, node $v$ survives after $d$-hop cascades if and only if less than a $\theta$ fraction of its neighbors fail after $d - 1$ rounds of cascades. Therefore, we have


Prk 

r  d 


E 


(1  —  p)k\ 


sr' a  sr^d-1  r  <f)k  ri/  Uj  ■  ri/  E(c/ 1) 
E,=i  E/=o 


A  d—l  A 

(=1  j= 0 


/= 1 


,(d-l) 


/c 


,T/' 


(1  P)  I  xr^d-1  r 

Eti  Ejto1  h<ek  V2"'=1  ^=0  ,J 
where the third step follows from Hoeffding's inequality [76] and the last step follows from $\max\{\Pr^k_d\} \le 1 - p$, since the probability that a node survives the cascades cannot exceed the probability that it survives the initial random failures alone. □


Lemma 24. When $p > \theta$, the lower bound $\min\{\Pr^k_d\}$ of the probability $\Pr^k_d$ that a node $v$ of degree $k$ survives after $d > 0$ rounds of cascades can be recursively computed using

$$(1 - p)\binom{k}{\theta k}\Big(1 - \sum_{i=1}^{\Delta} \phi_i \cdot \min\{\Pr^i_{d-1}\}\Big)^{\theta k}\Big(\sum_{i=1}^{\Delta} \phi_i \cdot \min\{\Pr^i_{d-1}\}\Big)^{(1-\theta)k}$$

where $\phi_i$ is the probability that a node of degree $k$ has a neighbor of degree $i$, and $\Pr^i_0 = 1 - p$ for any degree $i$ since each node randomly fails with the same probability $p$ at the beginning.

Proof. According to the proof of Lemma 23, we know that

Pr$  =  (1-p)  E  (  a  L-1  ) 

Eti  J2j=o  fij<ek  ^'_1  ^ J~°  'J' 

£  ^  <t>iPii)^j  f«(i  0' pu )*-E,Ej' 

/=1  j=0  /=1  j=0 

> (1  -  p)  (L)  (i  -  E  'M**-!))'1"'”' 

x  '  /=i  ;=i 

Next, consider the function $f(x) = (1 - x)^y x^{k-y}$ for some $0 < x < 1$ and $0 < y < k$. We have

$$\frac{df}{dx} = (1 - x)^{y-1} x^{k-y-1} (k - kx - y)$$

It is easy to see that $\frac{df}{dx} > 0$ iff $k - kx - y > 0$. That is, $x < 1 - \theta$ when $y = \theta k$. Since $\min \Pr^i_{d-1} \le q_j \le 1 - p$, we have $q_j < 1 - \theta$ for any $0 \le j \le d - 1$ when $p > \theta$. □

Lemma 25. The expected number of nodes of degree $k' = k \sum_{i=1}^{\Delta} \phi_i \Pr^i_d$ in the residual graph can be estimated as $\frac{e^{\alpha}}{k^{\beta}} \Pr^k_d$, where the bounds of $\Pr^k_d$ are determined by Lemmas 23 and 24.

Proof. Consider a node of degree $k$ in the original graph $G$. After cascading failures, its degree can be estimated based on the survival of its neighbors. In particular, each neighbor has degree $i$ with probability $\phi_i$. Moreover, a node of degree $i$ in $G$ survives $d$-round cascading failures with probability $\Pr^i_d$. Therefore, for a node of degree $k$ in $G$, its degree in the residual graph can be estimated as $k \sum_{i=1}^{\Delta} \phi_i \Pr^i_d$. On the other hand, each node of degree $k$ in $G$ survives the cascades with probability $\Pr^k_d$, and there are $\frac{e^{\alpha}}{k^{\beta}}$ nodes of degree $k$ in $G$. Therefore, the proof is complete. □
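
The degree estimate used in the proof of Lemma 25 can be evaluated numerically. The sketch below is illustrative only: it normalizes the neighbor-degree distribution by a finite sum up to an assumed maximum degree rather than by a zeta function (our assumption, chosen so that the weights always sum to one), and the survival probabilities are supplied by the caller rather than computed from Lemmas 23 and 24. The function names are hypothetical.

```python
def neighbor_degree_distribution(beta, max_degree):
    """phi[i-1] is proportional to i**(1 - beta): probability that a neighbor has degree i.

    Assumption: normalized over degrees 1..max_degree (finite truncation).
    """
    weights = [i ** (1.0 - beta) for i in range(1, max_degree + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_residual_degree(k, phi, survival):
    """Estimate k' = k * sum_i phi_i * Pr_i for a node of original degree k.

    survival[i-1] is the probability that a node of degree i survives the cascades.
    """
    return k * sum(p * s for p, s in zip(phi, survival))


phi = neighbor_degree_distribution(beta=1.5, max_degree=10)
# If every degree class survives with probability 0.5, a degree-4 node keeps about 2 links.
k_prime = expected_residual_degree(4, phi, [0.5] * 10)
```

In practice the `survival` list would be filled with either the upper or the lower bound on the survival probability, giving a bracket for the residual degree.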

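Using Lemma 25, we can obtain the following theorem:

Theorem 3.7. The expected first-order and second-order degree summations are

$$e^{\alpha} \sum_{k=1}^{\Delta} k^{1-\beta} \Pr^k_d \sum_{i=1}^{\Delta} \phi_i \Pr^i_d \quad \text{and} \quad e^{\alpha} \sum_{k=1}^{\Delta} k^{2-\beta} \Pr^k_d \Big(\sum_{i=1}^{\Delta} \phi_i \Pr^i_d\Big)^2$$

respectively.

Proof. According to the definition of first-order degree summation, we have

$$\sum_{k=1}^{\Delta} \frac{e^{\alpha}}{k^{\beta}} \Pr^k_d \cdot k \sum_{i=1}^{\Delta} \phi_i \Pr^i_d = e^{\alpha} \sum_{k=1}^{\Delta} k^{1-\beta} \Pr^k_d \sum_{i=1}^{\Delta} \phi_i \Pr^i_d$$

Again, according to the definition of second-order degree summation, we have

$$\sum_{k=1}^{\Delta} \frac{e^{\alpha}}{k^{\beta}} \Pr^k_d \Big(k \sum_{i=1}^{\Delta} \phi_i \Pr^i_d\Big)^2 = e^{\alpha} \sum_{k=1}^{\Delta} k^{2-\beta} \Pr^k_d \Big(\sum_{i=1}^{\Delta} \phi_i \Pr^i_d\Big)^2$$

□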


3.7.3  Numerical  Analysis 

Here we show that our theoretical analysis agrees well with the simulation results. In particular, we generate power-law networks using the igraph package [26] and test on synthetic networks with different parameters, namely the exponential factor $\beta$ and the network size $n$. Since the results are similar for distinct parameters, we only provide the result with $\beta = 1.5$ and $n = 250$ in Fig. 3-7. As revealed in Fig. 3-7, apart from the close agreement between our analysis and the simulation, we also find that power-law networks are no longer robust under random failures when cascading failures occur. For example, when each node is attacked with probability only 0.4, the network is a.s. fragmented after only 1-hop propagation. This transition already happens at probability 0.2 if failures can cascade 2 hops, and the threshold almost vanishes when more cascading hops are allowed.

Figure 3-7. Numerical Analysis in Power-Law Networks ($\beta = 1.5$, $n = 250$). We plot the three cascading hops and find that our analysis (pink plots) approximates the simulation of the total pairwise connectivity (PWC) after cascading failures surprisingly well, in both cases that power-law networks are a.s. unaffected (PWC $\propto n^2$) and a.s. fragmented.

3.8  Related  Works 

There are a great number of studies regarding the tolerance of real-world networks against failures and attacks using different metrics. Edge vulnerability in metabolic networks was studied by Kaiser et al. with respect to the average shortest path and the clustering coefficient [50]. For power grid networks, Albert et al. [3] investigated their vulnerability by measuring the loss of connectivity under various threats, including random, cascading, load-based and degree-based nodal failures. The disruption of vital interstate systems was assessed by Matisziw et al. [63] according to the available number of compromised s-t flows. Cohen et al. [23] showed the resilience of the Internet to the random breakdown of nodes based on percolation theory. In [72], Satorras et al. also revealed that the random uniform immunization of individuals cannot lead to the eradication of communications in complex social networks using the reduced prevalence rate. Doyle et al. [30] and Sydney et al. [81], using a novel metric ELASTICITY, showed that Internet topologies are less affected by both random and targeted attacks than power-law networks. In general, the robustness of other complex networks was studied in [46, 47] using algebraic connectivity, i.e., the second-smallest eigenvalue of the Laplacian matrix of a graph. Recently, from a different perspective, Alderson et al. [7] focused on the role of organization and design in terms of the complexity in highly organized technological and biological systems.

More generally, Albert et al. [5] first compared the robustness of complex systems with the power-law and exponential properties. By measuring the diameter, the relative size of the largest cluster and the average size of the isolated clusters, power-law networks are empirically observed to tolerate failures to a surprising degree, but their survivability decreases rapidly under attacks when compared with exponential networks. Later on, Holme et al. [43] further investigated the degree of harm to power-law networks under different attack strategies. Unfortunately, all these observations are derived from experiments and lack theoretical foundations.

CHAPTER 4

OPTIMIZATION OF POWER-LAW NETWORKS

In this chapter, we investigate the tradeoff between maintenance costs and robustness guarantees in power-law networks. In particular, we focus on power-law networks with $\beta < 2.9$, which have been found to tolerate random failures to an extremely high degree. In addition, since Fig. 3-3, 3-4 and 3-5 already revealed that power-law networks can tolerate preferential attacks if they can tolerate degree-centrality attacks when $\beta < 2.9$, we focus on guaranteeing their functionality under degree-centrality attacks. We study practical communication networks and social networks respectively to explore the underlying reasons for their real-world network topologies.

In addition, we show that detecting critical elements in power-law networks is NP-hard and present two algorithms for two different cases: element failures and cascading failures. The effectiveness of our algorithms is evaluated on both synthetic power-law networks and real-world networks.

4.1  Design  Optimization  of  Power-law  Networks 
The above vulnerability assessments suggest that power-law networks are more robust when $\beta$ is smaller. However, a majority of real-world networks have an exponential factor $\beta$ ranging from 2 to 2.5 rather than some small $\beta$ approaching 1 or even less. This naturally raises several questions: Would real-world networks be better off if they were denser and hence more robust? What causes them to be sparser than expected? Do there exist potential optimization factors?

To address these questions, in this section, we investigate the tradeoff between maintenance costs and robustness guarantees in power-law networks. In particular, we focus on power-law networks with $\beta < 2.9$, which have been found to tolerate random failures to an extremely high degree. In addition, since Fig. 3-3, 3-4 and 3-5 already revealed that power-law networks can tolerate preferential attacks if they can tolerate degree-centrality attacks when $\beta < 2.9$, we focus on guaranteeing their functionality under degree-centrality attacks. We study practical communication networks and social networks respectively to explore the underlying reasons for their real-world network topologies.

4.1.1  Communication  Networks 

In the design of communication networks, such as the Internet and telecommunication networks, we are required not only to guarantee their functionality but also to reduce the maintenance costs. Among the various network performance metrics (e.g., delay, packet loss, throughput), guaranteeing connectivity has the highest priority. That is, a real-world network only needs a sufficient number of links to guarantee its functionality, and its other performance metrics can be guaranteed by adjusting its capacity planning [59]. In particular, we consider costs that include the link costs and the protection costs for critical nodes. Since degree and betweenness centrality are closely correlated in non-fractal power-law networks [56], here we consider the critical nodes to be the degree-centrality nodes.

To formulate the optimization function for power-law communication networks, we first prove the following Lemma 26 by considering the worst case with respect to the robustness of power-law networks. That is, as mentioned above, after protecting the degree-centrality nodes, a power-law network a.s. remains highly connected (its total pairwise connectivity is a.s. $\Theta(n^2)$) even if all other nodes fail.

Lemma 26. Let $G_{cp}$ be the residual graph consisting only of the protected degree-centrality nodes (the nodes of degree larger than $x_0$). We have

• The pairwise connectivity $P$ is a.s. $\Theta(n^2)$


•  The  pairwise  connectivity  P  is  a.s.  at  most \n i  log  n 
Proof. Suppose that we protect only the nodes of degree larger than $x_0$ and that all other nodes fail. Similar to Corollary 10, the expected degree sequence can be written as


The rest of the proof is the same as that of Theorem 3.4. □


In order to guarantee the functionality of a power-law network, we take the above Lemma 26 as the condition. Meanwhile, we aim to minimize the maintenance costs, which include the link costs and the critical node protection costs. In detail, we consider the following cost functions:

• Link Costs: Consider a link $(u, v)$ in $G_{(\alpha,\beta)}$. Its link cost depends heavily on the number of messages it transmits according to [62]. Another crucial factor for the link cost is its capacity flow [79]. Since degree and betweenness centrality are closely correlated in non-fractal power-law networks [56], we consider the link cost to be proportional to the average of the degrees of its two endpoints.

• Critical Node Protection Costs: For the critical nodes in $G_{(\alpha,\beta)}$, apart from their degrees, their protection costs are also closely related to the network density. According to [79], the costs rise as the density increases, since a higher density enlarges the demand for message exchanges. In addition, as investigated in [62], the chain reaction leads to a roughly exponential increase of costs. We therefore consider the cost $\gamma(x)$ to protect a node of degree $x$ to be $a \cdot x^{b/\beta}$ for some constants $a$ and $b$.
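
To make the two cost terms concrete, the following sketch evaluates them on an explicit edge list. It is not the optimization program of this section: it assumes (our assumptions) that the total cost is the plain sum of the two terms, that the proportionality constant of the link cost is 1, and that exactly the nodes of degree larger than a threshold `x0` are protected. All function names are hypothetical.

```python
def link_cost(deg_u, deg_v):
    """Link cost proportional to the average degree of the two endpoints (constant taken as 1)."""
    return (deg_u + deg_v) / 2.0


def protection_cost(x, beta, a=1.0, b=1.0):
    """Cost a * x**(b / beta) of protecting a node of degree x."""
    return a * x ** (b / beta)


def maintenance_cost(edges, x0, beta, a=1.0, b=1.0):
    """Sum of link costs plus protection costs of nodes with degree larger than x0."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    links = sum(link_cost(degree[u], degree[v]) for u, v in edges)
    protected = sum(protection_cost(d, beta, a, b) for d in degree.values() if d > x0)
    return links + protected


# A star with three leaves: each link costs (3 + 1) / 2, and only the hub is protected.
star = [(0, 1), (0, 2), (0, 3)]
total = maintenance_cost(star, x0=1, beta=2.0, a=1.0, b=2.0)
```

Lowering `x0` protects more nodes and raises the second term, while a denser graph raises the first term, which is the tradeoff discussed in this section.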

Therefore, we formulate the following mixed integer program (MIP), with two variables $x_0$ and $\beta$, as

(4-1)

$x_0 \in \mathbb{Z}^+$, $x_0 \le \Delta$

$\beta > 0$

Note that we omit the proportionality constant of the link cost since it does not affect the optimization of the total maintenance costs.

4.1.2 Social Networks

As we mentioned at the beginning of this paper, one of the main threats in social networks is malware propagation [87]. Thus, apart from the factors in [11], the containment of such malicious spreading becomes another crucial factor in the sparsification of social networks. In other words, when an individual is infected, we want to minimize the expected total number of infected users, which can be achieved by immunizing critical users beforehand. Therefore, minimizing the immunization costs becomes an urgent need.

Thus, in order to formulate the optimization function for power-law social networks, we first investigate the upper bound on the expected size of a connected component after protecting the critical users, which are again taken to be the degree-centrality nodes. That is, we focus on the size of the connected components of the residual network after removing the immunized users. Defining the residual graph $G_S$ to be the residual power-law graph $G[V \setminus S]$ after immunizing the individuals in $S$, the following Theorem 4.1 gives the bound on the expected size of a connected component of $G_S$.


Theorem  4.1 .  In  the  residual  graph  Gs  of  G{q  3),  the  expected  size  of  a  connect¬ 
ed  component  c  is  a.s.  upper  bounded  by  o(n^)  when  d5  <  1,  that  is,  x0  < 


maxHc(fe£ti^< i}- 


Proof.  Consider  the  connected  components  clt  c2, ... ,  ck  in  Gs,  their  expected  size  c 
can  be  written  as  \  ]T1</<A.  c<-  According  to  [21],  all  connected  components  a.s.  have 
volume  at  most  C'y/h  for  some  constant  C  when  d  <  1.  Therefore,  the  number  of 
connected  components  is  at  least  C^fh  where  C  =  1  / C'.  Supposing  that  c  >  7 in  for 
some  constant  7  with  probability  p,  the  probability  that  any  random  pair  of  nodes  are  in 
the  same  component  can  be  lower  bounded  by 

-7=2 p  Y c'2  ^  ^=^2/c  -  ~^pCl2n 

n2ds  i<i<k  n2ds  n2ds 
On the other hand, according to Chung et al. [21], we know that the probability that any random pair of nodes belongs to the same component is upper bounded by

% 

(1  -  ds)nds 


Combining  the  above  two  bounds,  we  know 


1  ^  9 
— =2 PC7  n  < 
n2ds 


(1  -  d5)nd5 


which  implies  that 


- d*~ds  ~ 

-  C72(  1  -  ds) 

That  is,  by  choosing  C  to  be  log  n,  with  probability  at  least  1  -  o(l),  the  expected  size  c 
of  connected  components  is  a.s.  at  most  0(  n? ).  □ 


Again, taking the above theorem as the condition and using the same protection cost function for critical users, $\gamma(x) = a \cdot x^{b/\beta}$, we formulate the following mixed integer program, with two variables $x_0$ and $\beta$, in order to make sure that the expected size of connected components in the residual power-law networks is no larger than $O(n^{1/4})$:


min  E?=xo+iStM 
s-t-  E2=i  ^=2  <  i 

Xq  G  Z.  + ,  x0  <  A 


P  >  0 


4.1.3 Optimal Range of Exponential Factor $\beta$

For communication networks, consider the practical range of protection costs from 0 to $x^{9/\beta}$ for a node of degree $x$ (that is, $b \in [0, 3]$). Fig. 4-1 reveals the relation between maintenance costs and the optimal $\beta$ according to MIP (4-1). As one can see, the optimal $\beta$ ranges from 1.8 to 2.5; no matter how large the constant $b$ is, the exponential factor $\beta$ is no less than 1.8. (Note that the curve is invariant for distinct network sizes since the effect of light-tailed elements in the Riemann zeta function can be neglected.)


Figure 4-1. Optimal Robust Communication Networks (horizontal axis: cost constant b in maintenance costs)


Fig. 4-2 reports that the optimal range of $\beta$ is from 2.3 to 2.4 in social networks according to MIP (4-2). We observe that the increase of $b$ does not really affect the range of $\beta$, and the curve also remains invariant with respect to different network sizes.


Figure 4-2. Optimal Robust Social Networks (horizontal axis: cost constant b in maintenance costs, from 1 to 7; vertical axis ticks from 2.3 to 2.35)


In summary, the analysis of both communication networks and social networks gives us a reasonable explanation of the topology of real-world power-law networks, that is, the best range of the exponential factor $\beta$ is [1.8, 2.5]. Outside this range, either the network maintenance cost becomes very expensive (when $\beta < 1.8$), or the network robustness is unpredictable (when $\beta > 2.5$) due to its dependence on the specific attacking strategy.

4.2  Critical  Elements  Detection  in  Power-law  Networks 

In this section, we study two practical optimization problems, namely Critical Link Disruptor (CLD) and Critical Node Disruptor (CND), to assess the network vulnerability when a given number of network elements (links or nodes) fail. We refer to these elements as critical links and nodes hereafter.

Definition 23 (Critical Link Disruptor). Given an integer $k$ and a weighted undirected graph $G = (V, E, W)$, the problem asks for a weight-bounded subset of critical links $S \subseteq E$, i.e., $\sum_{(i,j) \in S} w_{ij} \le k$, whose removal minimizes the total pairwise connectivity in $G[E \setminus S]$.

Definition 24 (Critical Node Disruptor). Given an integer $k$ and a weighted undirected graph $G = (V, E, W)$, the problem asks for a weight-bounded subset of critical nodes $S \subseteq V$, i.e., $\sum_{v_i \in S} w_i \le k$, whose removal minimizes the total pairwise connectivity in $G[V \setminus S]$.
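
As an illustration of the CND objective, the sketch below computes the total pairwise connectivity, taken here to be the number of node pairs that lie in the same connected component, and solves CND by exhaustive search for unit node weights. The exhaustive search is exponential in the budget and is only meant for tiny examples; it is not the algorithm proposed in this chapter, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_connectivity(nodes, edges):
    """Number of unordered node pairs that are connected by a path."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())


def brute_force_cnd(nodes, edges, k):
    """Try every node subset of size at most k (unit weights); return (connectivity, removed set)."""
    best = None
    for size in range(k + 1):
        for removed in combinations(nodes, size):
            gone = set(removed)
            kept_nodes = [v for v in nodes if v not in gone]
            kept_edges = [(u, v) for u, v in edges if u not in gone and v not in gone]
            value = pairwise_connectivity(kept_nodes, kept_edges)
            if best is None or value < best[0]:
                best = (value, gone)
    return best


# On the path 0 - 1 - 2 - 3 - 4, removing the middle node is the best single removal.
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
best_value, best_set = brute_force_cnd(nodes, edges, k=1)
```

The same connectivity routine can score a candidate set of removed links for CLD by filtering the edge list instead of the node list.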

Moreover, taking into account the cascading failures, we define another problem focusing on the detection of critical nodes, called the Cascading Vulnerability Node Detection (CVND) problem, as follows:

Definition 25 (Cascading Critical Node Disruptor). Given two integers $k$ and $d$, a fractional number $\theta \in (0, 1)$ and an undirected graph $G = (V, E)$, let $P(S)$ be the total pairwise connectivity of the residual graph of $G$ after the $d$-hop cascading failures caused by the initial removal of the set of nodes $S \subseteq V$. The CVND problem asks for the $k$ most vulnerable nodes such that $P(S)$ is minimized.

4.2.1  Hardness  of  Detecting  Critical  Links  and  Nodes 

In this subsection, we show that the CLD and CND problems are NP-hard, which rules out a polynomial-time exact algorithm unless P = NP.

Lemma 27 (Ferrante et al. [35]). Let $G_1 = (V_1, E_1)$ be a simple graph with $n$ nodes and $\beta \ge 1$. For $\alpha \ge \max\{4\beta, \beta \log n + \log(n + 1)\}$, we can construct a power-law graph $G = G_1 \cup G_2$ with exponential factor $\beta$ and $e^{\alpha}\zeta(\beta)$ nodes by constructing a bipartite graph $G_2$ as a maximal component in $G$.

Lemma 28. The clique separator (CS) problem is NP-hard. The problem is defined as follows: given an undirected graph $G = (V, E)$, find a minimum set of links $S \subseteq E$ such that the connected components of $G[E \setminus S]$ are cliques, each of size at least 3.

Theorem 4.2. The CLD problem is NP-hard on power-law graphs even if all nodes have unit weights.

Proof. Consider the decision version of CLD that asks whether an undirected graph $G = (V, E)$ contains a set of links $S \subseteq E$ of size $k$ such that the pairwise connectivity in the residual graph $G[E \setminus S]$ is at most $c$ for a given positive integer $c$. To prove that CLD on power-law graphs is NP-hard, we reduce the clique separator (CS) problem to it. After constructing a power-law graph $G' = G \cup G_b$, where the bipartite graph $G_b = (U_b, V_b; E_b)$ is a maximal component in $G'$ according to Lemma 27, we show that there is a CS of size $k$ in $G$ iff $G'$ has a CLD $S'$ of size $k'$ such that the pairwise connectivity of $G'[E' \setminus S']$ is at most $c$, where $k' = k + |E_b| - |M_b|$ and $c = |E| - k + |M_b|$. Note that $M_b$ is the set of links in a maximum matching of $G_b$.

First, suppose $S \subseteq E$ is a clique separator of $G$ with $|S| = k$. The pairwise connectivity in $G[E \setminus S]$ is $|E| - k$ since all components in this graph are cliques. Since a maximum matching on $G_b$ can be found in polynomial time using the Hopcroft-Karp algorithm [44], the pairwise connectivity on $G'$ is $c$ after removing the additional $|E_b| - |M_b|$ links.

Conversely, suppose that $S' \subseteq E'$ is a CLD of $G'$ with size $k'$. Note that $S' = A \cup S_b$, where $A$ and $S_b$ are CLDs on $G$ and $G_b$ respectively. We show that the number of critical links $S_b$ in $G_b$ is $|E_b| - |M_b|$. If $|S_b| < |E_b| - |M_b|$, the pairwise connectivity of $G_b$ increases by at least two when adding one more link onto the maximum matching. On the other hand, the removal of $l$ links on $G$ can reduce the pairwise connectivity by at most $l$ after removing the CS of size $k$. If $|S_b| > |E_b| - |M_b|$, the pairwise connectivity of $G_b$ is reduced by one when removing one more link from the maximum matching. Meanwhile, a link added onto the residual graph of $G$ will increase the pairwise connectivity by at least one if it connects two independent nodes and by at least 3 if it has one endpoint belonging to some component in the residual graph of $G$. Thus, we have $|S_b| = |E_b| - |M_b|$ and it is easy to verify that $A$ is a CS of $G$. □

Theorem 4.3. The CND problem is NP-hard on power-law graphs even if all nodes have unit weights.

Proof. Consider the decision version of CND that asks whether a graph $G = (V, E)$ contains a set of nodes $S \subseteq V$ of size $k$ such that the pairwise connectivity in $G[V \setminus S]$ is at most $c$ for a given positive integer $c$. To prove that CND on power-law graphs is NP-hard, we reduce the vertex cover (VC) problem to it. Let an undirected graph $G = (V, E)$ with $|V| = n$ and a positive integer $k < n$ be any instance of VC. We construct a power-law graph $G' = (V', E')$ as follows. First, for each node $v_i \in V$ in $G$, we attach one additional node $u_i$ to it; we call the resulting graph $G_1$, with $V_1 = V \cup U$ where $U = \{u_i\}$. Then, according to Lemma 27, a power-law graph $G' = (V', E')$ can be constructed by embedding $G_1$ and a bipartite graph $G_2 = (V_2^1, V_2^2; E_2)$, where $V_2^1$ and $V_2^2$ are two disjoint sets of nodes in $G_2$ and $\alpha \ge \max\{4\beta, \beta \log(2n) + \log(2n + 1)\}$ for some specific $\beta$. Note that $V_2^1$ and $V_2^2$ are marked gray and white respectively in Fig. 4-3. We show that there is a VC of size $k$ in $G$ iff $G'$ has a CND $S'$ of size $k'$ such that the pairwise connectivity of $G'[V' \setminus S']$ is at most $c$, where $k' = k + \min\{|V_2^1|, |V_2^2|\}$ and $c = n - k$.

First, suppose $S \subseteq V$ is a vertex cover of $G$ with $|S| = k$. Then $G$ has a vertex cover $S$ of size $k$ iff $G \cup G_2$ has a vertex cover $S'$ of size $k + \min\{|V_2^1|, |V_2^2|\}$, since VC is polynomially solvable in any bipartite graph according to Kőnig's theorem [49]. Then, after removing $S'$ from $G'$, only the disjoint links $(v_i, u_i)$ with $v_i \notin S$ are left. Therefore, the pairwise connectivity of the power-law graph $G'$ is $n - k$, which is equal to $c$.

Figure 4-3. An example of CND reduction on PLGs: (A) an instance; (B) the reduced graph $G' = G_1 \cup G_2$. For simplicity, we just draw the nodes in $G$ and its newly added nodes and links.

Conversely, suppose that $S' \subseteq V'$ with $|S'| = k'$ is a CND of $G'$, that is, the total pairwise connectivity of $G'[V' \setminus S']$ is at most $c$. First, if $u_i \in S'$, it is easy to verify that replacing $u_i$ with $v_i$ will further decrease the pairwise connectivity. Since $|S'| = k' = k + \min\{|V_2^1|, |V_2^2|\}$, we can easily modify $S'$ to be a vertex cover of $G \cup G_2$, where the total pairwise connectivity on $G'$ is at most $c = n - k$. Thus $S' \cap V$ is a VC of $G$. □

4.2.2  HILPR  Approach 

Apart from the above theoretical hardness results for CLD and CND, these two problems are usually even harder to approximate. The pairwise connectivity can either remain $\Theta(n^2)$ for CLD in dense networks even when $k$ is large, or reach 0 for CND when $k$ is larger than the size of a vertex cover. In this section, we present our solution, a Hybrid Iterative Linear Programming Rounding (HILPR) algorithm, for both the CLD and CND problems. At a high level, HILPR formulates CLD and CND as Integer Linear Programs (ILPs) and then solves them using an iterative rounding technique. In addition, HILPR also applies local search and constraint pruning techniques in order to further improve its efficiency and reduce its time complexity.

4.2.2.1  Integer  linear  programming  formulation 


Critical  Link  Disruptor 

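For each pair of nodes $i, j \in V$, we define an indicator variable $u_{ij}$ as:

$$u_{ij} = \begin{cases} 1, & \text{if } i \text{ and } j \text{ are connected} \\ 0, & \text{otherwise} \end{cases}$$

Then we have the following ILP:

$$\begin{aligned}
\min \quad & \sum_{i,j \in V} u_{ij} \\
\text{s.t.} \quad & u_{ij} + u_{jh} - u_{hi} \le 1 && \forall i, j, h \in V \\
& \sum_{(i,j) \in E} w_{ij}(1 - u_{ij}) \le k \\
& u_{ij} \in \{0, 1\}
\end{aligned} \tag{4-3}$$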


where the objective is to minimize the total pairwise connectivity. The first constraint imposes triangular connectivity: if nodes $i$ and $j$ are connected and nodes $j$ and $h$ are connected, then nodes $i$ and $h$ have to be connected. The second constraint means that the total weight of all deleted links has to be at most $k$. We note that for an edge $(i,j) \in E$, if $u_{ij} = 0$ in the ILP solution, then the link $(i,j)$ is a critical link.
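To make the objective concrete, the following minimal sketch computes the total pairwise connectivity of a graph and solves CLD by exhaustive search on a toy instance. It assumes unit link weights and nodes labelled 0..n-1; the function names `pairwise_connectivity` and `brute_force_cld` are illustrative and are not part of the formulation above. Exhaustive search is only feasible for very small graphs and is not the ILP approach itself.

```python
from itertools import combinations


def pairwise_connectivity(n, edges):
    """Number of connected node pairs in an undirected graph on nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    sizes = {}
    for v in range(n):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    # A component with s nodes contributes s*(s-1)/2 connected pairs.
    return sum(s * (s - 1) // 2 for s in sizes.values())


def brute_force_cld(n, edges, k):
    """Try every set of at most k unit-weight links (assumption: w_ij = 1)."""
    best = (pairwise_connectivity(n, edges), ())
    for size in range(1, k + 1):
        for removed in combinations(edges, size):
            rest = [e for e in edges if e not in removed]
            best = min(best, (pairwise_connectivity(n, rest), removed))
    return best


# Toy instance: two triangles joined by the bridge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
value, removed = brute_force_cld(6, edges, 1)
print(value, removed)
```

On this instance the only single link whose removal disconnects the graph is the bridge, so it is the critical link for k = 1.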

Critical  Node  Disruptor 

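For CND, we simply extend the above ILP formulation for CLD in (4-3) as

$$\begin{aligned}
\min \quad & \sum_{i,j \in V} u_{ij} \\
\text{s.t.} \quad & v_i + v_j + u_{ij} \ge 1 && \forall (i,j) \in E \\
& u_{ij} + u_{jh} - u_{hi} \le 1 && \forall i, j, h \in V \\
& \sum_{i \in V} w_i v_i \le k \\
& v_i \in \{0, 1\},\ u_{ij} \in \{0, 1\}
\end{aligned} \tag{4-4}$$

where $v_i$ is further defined as

$$v_i = \begin{cases} 1, & \text{if node } i \text{ is deleted (i.e., a critical node)} \\ 0, & \text{otherwise} \end{cases}$$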

The first (additional) constraint guarantees that at least one endpoint of a link is deleted if its two endpoints are disconnected in the optimal solution. The other constraints carry over from CLD. For unweighted graphs, we further restrict $k$ in CND to satisfy Lemma 29, which rules out zero pairwise connectivity, that is, the case in which all remaining nodes in the network are independent.

Lemma 29. For an unweighted graph $G$, the optimal pairwise connectivity of CND is larger than 0 if $k < |E|/\Delta$, where $\Delta$ denotes the maximum degree in $G$.

Proof. We argue by contradiction. Assume the optimal pairwise connectivity is 0, that is, $\sum_{i,j \in V} u_{ij} = 0$; then $u_{ij} = 0$ for every link $(i,j)$. Therefore, $v_i + v_j \ge 1$ according to the first constraint in ILP (4-4). Summing this inequality over all links gives $\sum_{i \in V} d_i v_i \ge |E|$, where $d_i$ is the degree of node $i$. Since we assumed $k < |E|/\Delta$, we have $\sum_{i \in V} d_i v_i \le \Delta \sum_{i \in V} v_i \le k\Delta < |E|$, which is a contradiction. $\square$
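The counting argument in Lemma 29 can be checked exhaustively on a small unweighted graph. The sketch below is only an illustration under that assumption: for every integer $k$ with $k\Delta < |E|$ it removes every possible set of $k$ nodes and confirms that at least one link survives, so the pairwise connectivity stays positive. The helper names `residual_edges` and `lemma29_holds` are hypothetical.

```python
from itertools import combinations


def residual_edges(edges, removed):
    """Links that survive after deleting the nodes in `removed`."""
    removed = set(removed)
    return [(a, b) for a, b in edges if a not in removed and b not in removed]


def lemma29_holds(n, edges):
    """Brute-force check of Lemma 29 on nodes 0..n-1 (tiny graphs only)."""
    degree = [0] * n
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    max_deg = max(degree)
    for k in range(n + 1):
        if k * max_deg >= len(edges):  # lemma only covers k < |E| / max degree
            break
        for removed in combinations(range(n), k):
            if not residual_edges(edges, removed):
                return False  # all links destroyed: connectivity would be 0
    return True


# 6-cycle: |E| = 6 and maximum degree 2, so the lemma covers k = 0, 1, 2.
cycle = [(i, (i + 1) % 6) for i in range(6)]
print(lemma29_holds(6, cycle))
# The bound is tight here: with k = 3, deleting nodes 0, 2, 4 leaves no link.
print(residual_edges(cycle, (0, 2, 4)))
```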

4.2.2.2 Hybrid iterative LP rounding algorithm

The HILPR algorithm consists of three main steps: (1) relaxing the integrality constraints of the above ILP to obtain the corresponding LP; (2) iteratively solving the LP with $k$ replaced by an experimental parameter $\gamma \le k$, and rounding to integers a set of fractional variables whose total weight is at most $\gamma$; and (3) performing a local search to further improve the solution. The detailed description is shown in Algorithm 7.

Specifically, in each iteration, we solve the LP after setting $k = \gamma$. Let $u^*$ and $v^*$ be the optimal (fractional) solutions of the LP for the CLD and CND problems, respectively. In the CLD problem, we round the set $X$ of smallest variables $u^*_{ij}$ to 0 such that $\sum_{u_{ij} \in X} w_{ij} \le \gamma$. Likewise, for the CND problem we round the set $Y$ of largest variables $v^*_i$ to 1 such that $\sum_{v_i \in Y} w_i \le \gamma$. In the next iteration, the graph is first updated according to the previous rounding results, i.e., the identified critical links or nodes are removed from the graph. The LP is then reformulated according to the new residual graph. The algorithm terminates when the total weight of all identified critical links (or nodes) reaches $k$.


Algorithm 7: HILPR for a given γ

Input: Graph G = (V, E), an integer k, γ
Output: The set of critical links/nodes S

S ← ∅;
// Iterative LP Rounding
while k > 0 do
    if k < γ then
        γ ← k;
    end
    Use Constraint Pruning to solve the LP formulation with γ;
    k ← k − γ;
    S' ← X links with smallest u* such that Σ_{u_ij ∈ X} w_ij ≤ γ (in case of CND, S' ← Y nodes with largest v* such that Σ_{v_i ∈ Y} w_i ≤ γ);
    S ← S ∪ S';
    G ← G[E \ S'];
end
// Local Search
S* ← S;
foreach element e ∈ S do
    Swapping(e) (Algorithm 8);
end
S ← S*;
return S;
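The rounding step inside the loop of Algorithm 7 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the fractional LP solution is already available as a dictionary (no LP solver is called here), and it reads "the X smallest variables" as a prefix of the sorted order that stops as soon as the next element would exceed the budget γ. The name `round_step` and the sample values are hypothetical.

```python
def round_step(fractional, weights, gamma, largest=False):
    """Select elements in sorted order of their LP value within budget gamma.

    CLD: largest=False picks the links with the smallest u* (rounded to 0).
    CND: largest=True picks the nodes with the largest v* (rounded to 1).
    """
    order = sorted(fractional, key=fractional.get, reverse=largest)
    chosen, used = [], 0
    for item in order:
        if used + weights[item] > gamma:
            break  # assumption: stop at the first element that does not fit
        chosen.append(item)
        used += weights[item]
    return chosen


# CLD example with unit weights and gamma = 2 (made-up fractional values).
u_star = {("a", "b"): 0.9, ("b", "c"): 0.1, ("c", "d"): 0.4, ("a", "d"): 0.7}
link_weights = {link: 1 for link in u_star}
critical_links = round_step(u_star, link_weights, gamma=2)

# CND example with unit weights and gamma = 1.
v_star = {1: 0.2, 2: 0.8, 3: 0.5}
node_weights = {node: 1 for node in v_star}
critical_nodes = round_step(v_star, node_weights, gamma=1, largest=True)
print(critical_links, critical_nodes)
```

In a full implementation, the selected elements would then be removed from the graph and the LP re-solved on the residual graph, as described above.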


Finally, the HILPR algorithm applies a meta-heuristic approach [38] to improve the solution $S$ obtained by the iterative rounding step above. The details of this local search are shown in the last part of Algorithm 7. Take CLD as an example. For each link $e$ in the solution $S$, we try a local swap between $e$ and each $e' \in N(e)$ to obtain the new solution $S' = S \cup \{e'\} \setminus \{e\}$. Let $f(G, S)$ be the pairwise connectivity of $G[E \setminus S]$. If $f(G, S') < f(G, S)$, we replace $e$ by $e'$ in the solution, set $S$ to be $S'$, and recursively continue the local search from $e'$. The recursion stops when no further improvement can be achieved by the local search, i.e., $f(G, S') \ge f(G, S)$. The whole algorithm stops once all links in $S$ have been visited. Similarly, the local search for CND is carried out by recursively checking the neighboring nodes.


Algorithm 8: Swapping(e)

S* ← S* \ {e};
if ∃ e' ∈ N(e) such that f(G, S') < f(G, S), where S' ← S \ {e} ∪ {e'} then
    S* ← S';
    Swapping(e');
end
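The swap-based local search for CLD can be sketched as below. This is a hedged illustration rather than a transcription of Algorithm 8: it is written iteratively instead of recursively, it assumes that $N(e)$ means the links sharing an endpoint with $e$ (the text does not define $N(e)$ explicitly), and the names `connectivity`, `f`, `neighbours` and `local_search` are illustrative.

```python
def connectivity(nodes, links):
    """Number of connected node pairs, via a depth-first component search."""
    adj = {v: set() for v in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, total = set(), 0
    for start in nodes:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v] - seen:
                seen.add(w)
                stack.append(w)
        total += size * (size - 1) // 2
    return total


def f(nodes, links, removed):
    """Pairwise connectivity of the graph after deleting the links in `removed`."""
    return connectivity(nodes, [e for e in links if e not in removed])


def neighbours(e, links):
    """Assumed meaning of N(e): links that share an endpoint with e."""
    return [x for x in links if x != e and set(x) & set(e)]


def local_search(nodes, links, solution):
    """Swap a chosen link for a neighbouring one while f strictly decreases."""
    solution = set(solution)
    improved = True
    while improved:
        improved = False
        for e in list(solution):
            for e2 in neighbours(e, links):
                if e2 in solution:
                    continue
                candidate = (solution - {e}) | {e2}
                if f(nodes, links, candidate) < f(nodes, links, solution):
                    solution, improved = candidate, True
                    break
            if improved:
                break
    return solution


# Two triangles joined by the bridge (2, 3); start from a poor choice.
links = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
result = local_search(range(6), links, {(1, 2)})
print(result)
```

Because every accepted swap strictly decreases the pairwise connectivity, the loop terminates; the budget is preserved since each swap keeps the solution size unchanged (unit weights assumed).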


We note that the LP formulations of CLD and CND each have $O(n^3)$ constraints owing to the triangle inequality constraints. To reduce the time spent solving the LP, we further propose a constraint pruning technique that eliminates inactive constraints, based on the following observation. As a result, the number of active triangle inequality constraints in (4-3) and (4-4) can be reduced to $O(n^2)$.


Figure  4-4.  Triangle  inequality  constraints 

Consider the four triangle inequality tuples $(i,j,k)$, $(i,j,h)$, $(i,k,h)$ and $(j,k,h)$, where $k$ here denotes a node rather than the budget, and suppose all constraints are satisfied at the beginning. That is, as shown in Fig. 4-4, $u_{ij} + u_{jh} - u_{hi} \le 1$ and $u_{jk} + u_{hk} - u_{jh} \le 1$ for the tuples $(i,j,h)$ and $(j,k,h)$, respectively. In the case that the triangle inequality of the tuple $(i,j,k)$ is tight (shaded in Fig. 4-4), that is, $u_{ij} + u_{jk} - u_{ik} = 1$, we have

$$u_{hi} \ge u_{ij} + u_{jh} - 1 \ge u_{ij} + u_{jk} + u_{kh} - 2 = u_{ik} + u_{kh} - 1$$

Thus, the triangle inequality of the tuple $(i,k,h)$ is satisfied (shown in bold in Fig. 4-4). Once the triangle inequality of the tuple $(i,j,k)$ is tight, the triangle inequality of the tuple $(i,k,h)$ is satisfied for all nodes $h$. Since the number of triangle inequality constraints is $3\binom{n}{3} = O(n^3)$, the number of active constraints is $O(n^3)/n = O(n^2)$ after the pruning process.
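As a small illustration of the constraint count, the sketch below enumerates all $3\binom{n}{3}$ triangle inequalities for a given assignment of the $u$ variables and reports which ones are violated. This is not the pruning procedure itself; a checker of this kind is merely one building block that a pruning or lazy constraint scheme could use. The function name `violated_triangles`, the tolerance and the sample values are assumptions.

```python
from itertools import combinations
from math import comb


def violated_triangles(n, u, tol=1e-9):
    """Return (violated constraints, total count) for u_ab + u_bc - u_ac <= 1.

    `u` maps each pair (a, b) with a < b to a value in [0, 1].
    """
    def val(a, b):
        return u[(a, b) if a < b else (b, a)]

    bad, total = [], 0
    for i, j, h in combinations(range(n), 3):
        # Each unordered triple yields three constraints, one per rotation.
        for a, b, c in ((i, j, h), (j, h, i), (h, i, j)):
            total += 1
            if val(a, b) + val(b, c) - val(a, c) > 1 + tol:
                bad.append((a, b, c))
    return bad, total


# Four nodes: 0-1 and 1-2 are marked connected, but 0-2 is not.
u = {pair: 0.0 for pair in combinations(range(4), 2)}
u[(0, 1)] = 1.0
u[(1, 2)] = 1.0
bad, total = violated_triangles(4, u)
print(bad, total, 3 * comb(4, 3))
```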

4.2.2.3  Performance  evaluation 

Performance of the HILPR Algorithm

The three networks we use to evaluate the performance of our proposed HILPR algorithm are described as follows:

1. The real terrorist network compiled by Krebs [57], with 62 nodes and 153 links, which reflects the relationships between the terrorists involved in the attacks of Sep. 11, 2001. This experiment evaluates the performance of HILPR on a real-world social network. In order to break down the terrorist network, we can capture the individuals corresponding to the critical nodes identified by HILPR.

2. Waxman network topology, a widely accepted Internet AS topological model, generated by the well-known BRITE [64].

3. Power-law network topology, generated by the Barabasi graph generator [1]. The power-law property has been discovered as one of the most remarkable properties of many large-scale networks such as the Internet and social networks.

To keep a density similar to that of the real terrorist network and to allow comparison with optimal solutions, we use instances with 70 nodes and 140 links. We generate 100 instances for both the Waxman and power-law models and report the average results.

In order to show the effectiveness of our proposed HILPR algorithm, we compare it with the optimal solution obtained by solving the ILP directly. We also compare HILPR with two centrality approaches that are often used in network analysis [17]: degree centrality (DC) and betweenness centrality (BC). In DC, the $k$ links (or nodes) of largest degree are selected as critical links (or nodes), where the degree of a link $(u, v)$ is defined as $d(u) + d(v)$. Similarly, in BC, the $k$ links (or nodes) with largest betweenness are selected, where the betweenness of a link or a node is defined as the number of shortest paths among all pairs of nodes that pass through it. For CND, we further compare HILPR with the CNLS approach proposed by Arulselvan et al. [9], which also aims to minimize the pairwise connectivity.
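The degree centrality baseline is simple enough to sketch directly. The code below assumes an unweighted graph given as an edge list and uses the link degree $d(u) + d(v)$ defined above; ties are broken by input order, which is an assumption since the text does not specify a tie-breaking rule. The function names are illustrative.

```python
from collections import Counter


def node_degrees(edges):
    """Degree of every node that appears in the edge list."""
    return Counter(v for e in edges for v in e)


def dc_critical_nodes(edges, k):
    """The k nodes of largest degree."""
    return [v for v, _ in node_degrees(edges).most_common(k)]


def dc_critical_links(edges, k):
    """The k links of largest degree, where deg((u, v)) = d(u) + d(v)."""
    deg = node_degrees(edges)
    ranked = sorted(edges, key=lambda e: deg[e[0]] + deg[e[1]], reverse=True)
    return ranked[:k]


# A star centred at node 0 with an extra link hanging off leaf 3.
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
top_nodes = dc_critical_nodes(edges, 2)
top_links = dc_critical_links(edges, 1)
print(top_nodes, top_links)
```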

Since $\gamma$ is the only free parameter in our algorithm, we first compare the impact of different $\gamma$ values, so that solution quality and running time can be balanced by carefully selecting this experimental value. As illustrated in Fig. 4-5, the solutions returned by our algorithm for different $\gamma$ values are very close to each other. Thus, we use $\gamma = 1$ for CND due to its slightly better performance, and $\gamma = 5$ for CLD to reduce the running time, since the number of critical links is usually larger than the number of critical nodes. Next, we show that our HILPR approach returns a near-optimal solution and outperforms the other approaches.


Figure 4-5. The performance of HILPR using different $\gamma$ in the terrorist network. Panel A: critical links; panel B: critical nodes.


Fig. 4-6 and Fig. 4-7 report the comparison of the HILPR algorithm with the centrality algorithms for CLD and CND on the above three networks. In these figures, the solution of the HILPR algorithm closely approaches the optimal solution for both CLD and CND on all three networks. (Note that a portion of the optimal solutions of CND in Waxman networks is missing due to the extremely high computational complexity, which arises because the network is neither almost intact nor almost fragmented.) The pairwise connectivity obtained by the degree centrality algorithms is much worse than that of HILPR, because links or nodes of higher degree may already be connected to other critical links or nodes and therefore need not be counted as critical themselves. For instance, the hub nodes (nodes of high degree) in power-law networks are not necessarily connected with each other, so the removal of two hub nodes can be less effective at reducing the pairwise connectivity than the removal of two other nodes that disconnect the network. The betweenness centrality performs worst because it considers only shortest paths rather than all paths. That is, a pair of nodes can still be connected even when the shortest path between them is destroyed. Our HILPR algorithm outperforms CNLS mainly because of the different strategy for choosing the initial critical nodes before the local search. The CNLS method is based only on the maximum degree from a maximal independent set, hoping that the removal of these nodes can greatly fragment the network. However, this is not always true, since many nodes in the maximal independent set are usually of low degree and consequently do not play an important role in destroying the network. In our HILPR approach, by solving the LP and rounding the top elements iteratively, we take into account all possible paths and connections between different nodes, so that the critical elements can be accurately identified.

Figure 4-6. The performance evaluation of HILPR against the degree and betweenness centrality algorithms for the CLD problem. Panel A: Waxman networks; panel B: power-law networks; panel C: terrorist network.

Figure 4-7. The performance evaluation of HILPR against the degree and betweenness centrality, and CNLS algorithms for the CND problem. Panel A: Waxman networks; panel B: power-law networks; panel C: terrorist network.


Figure  4-8.  Overlapping  critical  nodes  between  optimal  solution  and  HILPR  in  terrorist 
network 

To further show the effectiveness of our metric and algorithm, i.e., that the critical links and nodes in real-world networks can be correctly detected using our algorithm, we examine the real terrorist network, in which the identities of nodes are available. The results returned by our HILPR algorithm show that we can detect the truly important personnel by minimizing the total pairwise connectivity. For instance, the two nodes 37 and 48 in the terrorist network, which have been shown to be the leaders in [57], are both correctly detected by our HILPR algorithm with $k = 2$. In contrast, with the degree and betweenness centrality methods, only node 37 is detected when $k = 2$, and node 48 is not detected until $k$ is chosen to be 6 and 5, respectively.

Even though our objective is to minimize the pairwise connectivity, we are still interested in the overlap between the critical nodes returned by our HILPR algorithm and the optimal critical nodes. As reported in Fig. 4-8, in the real terrorist network, the optimal critical nodes are detected with 100% overlap by our HILPR algorithm in more than 1/3 of the cases over different $k$ values. The average overlap is around 80%, since some nodes play the same role in network connectivity, so that the pairwise connectivity can still be minimized even though our HILPR algorithm identifies critical nodes different from those in the optimal solution.

Moreover, the running time of our HILPR algorithm is less than 5 seconds on all three networks, for detecting either critical links or nodes, which is only slightly slower than the centrality algorithms (1-2 seconds) and the CNLS algorithm (2-3 seconds). In particular, when $k$ is small, i.e., only the most critical elements need to be detected, our algorithm finishes in around 3 seconds, which further illustrates the effectiveness of our HILPR algorithm in terms of both solution quality and running time.

Metric  Evaluation 

We evaluate the residual network obtained by the HILPR algorithm under various network vulnerability metrics. As shown in Fig. 4-6 and Fig. 4-7, degree centrality and betweenness centrality cannot accurately reflect the network vulnerability. Therefore, we focus on the following three other metrics: (1) the average shortest path length (ASP) between node pairs (the shortest distance is 0 if the pair of nodes is not connected), (2) the average available flow (AAF) between node pairs, and (3) the global clustering coefficient (GCC), defined as the ratio of the number of closed triplets to the number of connected triples of vertices, in which a closed triplet consists of three nodes that are connected by three undirected ties.
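A minimal sketch of the GCC computation follows, assuming the standard definition (closed triplets divided by connected triples, with each triangle closing three triples). It returns `None` when the graph has no connected triple, which is one possible convention for the degenerate case in which the ratio has a zero denominator. The function name `global_clustering` is illustrative.

```python
def global_clustering(n, edges):
    """Closed triplets / connected triples on nodes 0..n-1, or None if undefined."""
    adj = [set() for _ in range(n)]
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    # Connected triples centred at v: any two distinct neighbours of v.
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj)
    # Closed triplets centred at v: neighbour pairs that are themselves adjacent.
    closed = sum(
        1
        for v in range(n)
        for a in adj[v]
        for b in adj[v]
        if a < b and b in adj[a]
    )
    return None if triples == 0 else closed / triples


triangle = [(0, 1), (1, 2), (0, 2)]
print(global_clustering(3, triangle))             # fully clustered
print(global_clustering(3, [(0, 1), (1, 2)]))     # one open triple
print(global_clustering(4, [(0, 1), (2, 3)]))     # no connected triple
```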

In particular, we are interested in how the values of these metrics change in the residual network after we remove the critical elements, whose removal successfully reduces the pairwise connectivity of the network. Since our HILPR algorithm can detect the real critical links and nodes, as discussed in the previous subsection, we evaluate the above three metrics on the residual graphs obtained by the HILPR algorithm. Fig. 4-9 shows the changes in the values of these metrics after removing different numbers of critical links or nodes. Unfortunately, none of these three metrics clearly reflects the network vulnerability in the residual networks. As for the ASP, we can only consider the ASP within each connected component after the network is disconnected; otherwise the ASP becomes infinite and therefore fails to measure the network vulnerability. However, the value of ASP within connected components is either irregular (Fig. 4-9A) or counterintuitive, i.e., ASP usually increases after removing critical elements (Fig. 4-9B). Similarly, the AAF fails to assess the network vulnerability due to its irregularity for critical links. The monotonic decrease of AAF in the residual networks after removing critical nodes is largely due to the disconnection of the network, which reduces the flow between two nodes in different connected components to 0. However, the nodes that disconnect the network are not necessarily critical nodes. Finally, the variation of GCC values is irregular for both critical links and nodes, due to the simultaneous decrease of the number of connected triples of vertices. In particular, when the network is highly fragmented, this metric becomes undefined and meaningless (in the residual network for $k > 22$, as shown in Fig. 4-9B), since the number of connected triples of vertices becomes 0.


Figure 4-9. The comparison of different metrics on the terrorist network. Panel A: critical links; panel B: critical nodes.

4.2.3 TRGA Approach under Cascading Failures

As illustrated in Section 3.7, cascading failures can lead to an entirely different set of critical elements. In this subsection, we propose our solution, a Traceback and LP Rounding-Greedy Algorithm (TRGA), for solving the CCND problem.

4.2.3.1  TRGA:  an  iterative  2-phase  algorithm 

At a high level, the TRGA algorithm (Algorithm 9) iteratively detects the most vulnerable nodes until $k$ vulnerable nodes have been found. Each iteration has two phases: (1) identifying the ultimate failure nodes after cascading failures; (2) tracing back to the vulnerable nodes based on these failure nodes. In addition, TRGA uses lazy-update and constraint pruning techniques in each iteration to further reduce its time complexity. At the end, a local search is applied to improve the solution quality. The rest of this subsection discusses the two phases of each iteration in detail.

Phase 1: Ultimate Failure Nodes Identification. In order to detect the ultimate failure nodes, the idea is to first guess the extent of fragmentation in the residual network, i.e., the final number of connected node pairs, and then identify these nodes using the iterative rounding approach in [77], which has been shown to be one of the best approaches for detecting critical nodes when failures do not cascade. Denoting the residual pairwise connectivity by $\varphi$, we estimate it based on the following intuition and observation: the larger the degree, the more vulnerable the node is in a network [84]. Therefore, in each iteration, we choose the $k'$ nodes of highest degree in the residual network ($k'$ = $k$ minus the number of detected vulnerable nodes), simulate the cascading failures after deleting them, and obtain the pairwise connectivity $\varphi$. Then, we have the following Integer Linear Programming (ILP) formulation:

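$$\begin{aligned}
\min \quad & \sum_{i \in V} v_i \\
\text{s.t.} \quad & v_i + v_j + u_{ij} \ge 1 && \forall (i,j) \in E \\
& u_{ij} + u_{jh} - u_{hi} \le 1 && \forall i, j, h \in V \\
& \sum_{i,j \in V} u_{ij} \le \varphi \\
& v_i \in \{0, 1\},\ u_{ij} \in \{0, 1\}
\end{aligned} \tag{4-5}$$

where $v_i$ is further defined as

$$v_i = \begin{cases} 1, & \text{if node } i \text{ is deleted (i.e., a vulnerable node)} \\ 0, & \text{otherwise} \end{cases}$$

and

$$u_{ij} = \begin{cases} 1, & \text{if } i \text{ and } j \text{ are connected} \\ 0, & \text{otherwise} \end{cases}$$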

The first constraint guarantees that at least one endpoint of a link is deleted if its two endpoints are disconnected in the optimal solution. The second constraint imposes triangular connectivity, while the third constraint means that the total pairwise connectivity after cascading failures is at most $\varphi$. To solve it effectively, we borrow the idea in [77]: relax (4-5), iteratively solve the LP, and round the largest $v_i^*$. Likewise, we apply constraint pruning when solving the LP, and a local search at the end of the whole ultimate failure nodes identification.

Phase 2: Vulnerable Nodes Tracing Back. Given the set of ultimate failure nodes after cascading failures, we trace back to the vulnerable nodes using the following greedy algorithm. In each iteration, we select the node whose removal leads to the collapse of the most ultimate failure nodes, a quantity we call its cascading influence, by simulating the failure cascades after removing each node.

To obtain the cascading influence of each node at each iteration, the simplest approach is to recompute it for every node, yet this approach is extremely time-consuming. Instead, we apply a lazy-update process after the initial influence computation. Specifically, after the simulation in the first iteration, we maintain a max-priority queue $Q$ in which the priority of a node is its cascading influence. In each iteration, the node $u$ with the highest cascading influence is extracted, and its influence is recomputed with respect to the nodes already selected. If $u$ still has the highest priority, it is selected. Otherwise, $u$ is pushed back into the priority queue and the new node with the highest priority is picked. Note that if the number of selected vulnerable nodes is larger than $k'$, we instead choose the $k'$ nodes of highest degree in the residual network from the previous round and move on to the local search phase directly.
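The lazy-update idea can be sketched with a binary heap. The code below is a generic lazy greedy selection, not the TRGA implementation: the `gain` callback stands in for the cascading influence of a node given the nodes already selected, and the sketch assumes that this influence never increases as more nodes are selected (otherwise a stale priority could hide a better candidate). The names `lazy_greedy`, `gain` and `covers`, and the toy data, are hypothetical.

```python
import heapq


def lazy_greedy(candidates, gain, k):
    """Pick up to k items by marginal gain, re-evaluating only the queue head."""
    chosen = []
    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-gain(c, chosen), c) for c in candidates]
    heapq.heapify(heap)
    while heap and len(chosen) < k:
        _, c = heapq.heappop(heap)
        fresh = gain(c, chosen)  # recompute only for the extracted node
        if not heap or fresh >= -heap[0][0]:
            chosen.append(c)  # still the best candidate: select it
        else:
            heapq.heappush(heap, (-fresh, c))  # stale: push back, updated
    return chosen


# Toy stand-in for cascading influence: which failure nodes each node "covers".
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 2}, "d": {5}}


def gain(c, chosen):
    covered = set().union(*(covers[x] for x in chosen))
    return len(covers[c] - covered)


picked = lazy_greedy(covers, gain, 2)
print(picked)
```

In this toy run only the head of the queue is ever re-evaluated, which is the saving the lazy update is meant to provide over recomputing every node in every iteration.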

4.2.3.2 Optimality of CCND problem

In this subsection, we propose the following Integer Linear Programming (ILP) formulation of the CCND problem in order to obtain its optimal solution. Next, we apply a sparse metric technique to further reduce the number of constraints while keeping the same optimal result.

Mathematical  Formulation 

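For each pair of nodes $i, j \in V$, we define an indicator variable $u_{ij}$ as:

$$u_{ij} = \begin{cases} 1, & \text{if } i \text{ and } j \text{ are connected} \\ 0, & \text{otherwise} \end{cases}$$

and for all integers $t \in [0, d]$, we define

$$v_i^t = \begin{cases} 1, & \text{if node } i \text{ fails in round } t \\ 0, & \text{otherwise} \end{cases}$$

Note that $v_i^0 = 1$ when node $i$ is a vulnerable node and fails at the beginning. Then we have the following ILP:

$$\begin{aligned}
\min \quad & \sum_{i,j \in V} u_{ij} \\
\text{s.t.} \quad & v_i^d + v_j^d + u_{ij} \ge 1 && \forall (i,j) \in E \\
& u_{ij} + u_{jh} - u_{hi} \le 1 && \forall i, j, h \in V \\
& \sum_{i \in V} v_i^0 \le k \\
& \sum_{j \in N(v_i)} v_j^{t-1} + \theta \cdot \deg(v_i)\, v_i^{t-1} \ge \theta \cdot \deg(v_i)\, v_i^t && \forall i \in V,\ \forall\, 0 < t \le d \\
& v_i^t \ge v_i^{t-1} && \forall i \in V,\ 0 < t \le d \\
& v_i^t \in \{0, 1\} && \forall i \in V,\ 0 \le t \le d \\
& u_{ij} \in \{0, 1\}
\end{aligned} \tag{4-6}$$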

where the objective is to minimize the total pairwise connectivity. The first constraint guarantees that, after $d$ rounds of cascades, at least one endpoint of a link has failed if its two endpoints are disconnected in the optimal solution. The second constraint imposes triangular connectivity: if nodes $i$ and $j$ are connected and nodes $j$ and $h$ are connected, then nodes $i$ and $h$ have to be connected. The third constraint limits the number of initially failed (vulnerable) nodes to at most $k$. The last two constraints model the cascade process and keep failed nodes failed in the following rounds, respectively.
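To illustrate what the cascade constraint in (4-6) permits, the sketch below simulates a threshold cascade: a node fails in a round once the number of its failed neighbors reaches a fraction $\theta$ of its degree, and failed nodes stay failed. This reading of the cascade model is an assumption inferred from the constraint; the model itself is defined in an earlier section that is not reproduced here. The function name `cascade` and the toy graph are illustrative.

```python
def cascade(adj, seeds, theta, rounds):
    """Simulate up to `rounds` rounds of threshold failures.

    adj   : dict mapping each node to the list of its neighbours
    seeds : nodes that fail at round 0 (the vulnerable nodes)
    theta : failure threshold, as a fraction of a node's original degree
    """
    failed = set(seeds)
    for _ in range(rounds):
        newly = {
            v
            for v in adj
            if v not in failed
            and adj[v]  # isolated nodes never fail by cascade
            and sum(nb in failed for nb in adj[v]) >= theta * len(adj[v])
        }
        if not newly:
            break  # the cascade has stopped
        failed |= newly  # failed nodes remain failed in later rounds
    return failed


# Path 0 - 1 - 2 - 3 with a single seed at one end.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(cascade(path, {0}, theta=0.5, rounds=3))
print(cascade(path, {0}, theta=0.5, rounds=1))
print(cascade(path, {0}, theta=1.0, rounds=3))
```

With $\theta = 0.5$ the failure travels one hop per round along the path, whereas with $\theta = 1$ the interior node next to the seed never reaches its threshold and the cascade does not start.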

4.2.3.3  Experimental  evaluation 

In this section, we evaluate the performance of our TRGA algorithm on different types of synthetic and real-world networks. The simulation is implemented using the CPLEX optimization suite from ILOG, which includes the simplex method [42], the branch & bound algorithm, and advanced cutting-plane techniques [86].

The three networks we use to evaluate the performance of our proposed TRGA algorithm are as follows: (1) US Network Assets compiled by [77], with 71 nodes and 98 edges, which provides the current customer needs in XO Communications service. This experiment evaluates the performance of TRGA on a real-world communication network. In order to maintain the functionality of this communication network, we need to protect the most critical ISPs corresponding to the vulnerable nodes identified by TRGA; (2) Power-law network topology generated by the igraph library [26] using the model in [2], with $\beta = 1.8$ and 70 nodes; (3) Small-world network topology generated by the igraph library [26] using the Watts and Strogatz model in [70], with $k = 2$, H = 0.2 and 70 nodes. The parameters of these two synthetic networks are selected to keep a density similar to that of the US Network Assets network and to allow comparison with optimal solutions. We generate 100 instances for both power-law and small-world networks and report the average results.

In order to show the effectiveness of our proposed TRGA algorithm, we compare it with the optimal solution obtained by solving the ILP (4-6) directly. We also compare TRGA with two centrality approaches that are often used in network analysis [17]: degree centrality (DC) and betweenness centrality (BC). In DC, the $k$ nodes of largest degree are selected as vulnerable nodes, and in BC, the $k$ nodes with largest betweenness, obtained using [19], are selected as vulnerable nodes, where the betweenness of a node is defined as the number of shortest paths among all pairs of nodes that pass through it.

Fig. 4-10 reports the comparison of the TRGA algorithm with the centrality algorithms for CCND on the above three networks. In these figures, the solution of the TRGA algorithm closely approaches the optimal solution for CCND on all three networks. Except in power-law networks, in which nodes of high degree have been shown to be important nodes [84], the pairwise connectivity obtained by the degree centrality algorithms is much worse than that of TRGA. This is especially so in small-world networks: because node degrees there are homogeneous, nodes of higher degree may already be connected to other vulnerable nodes and therefore need not be counted as vulnerable themselves. The betweenness centrality performs worst in both power-law networks and US Network Assets because it considers only shortest paths rather than all paths. That is, a pair of nodes can still be connected even when the shortest path between them is destroyed. Yet, it outperforms degree centrality in small-world networks, in which the differences in degree are not substantial. In our TRGA approach, in the first phase of each iteration, by solving the LP and rounding the top elements iteratively, we take into account all possible paths and connections between different nodes, so that the critical elements can be accurately identified. Meanwhile, the back-tracing in the second phase of TRGA can precisely detect the original vulnerable nodes by using more information than degree or betweenness alone. Moreover, the running time of our TRGA algorithm is less than 10 seconds (due to the LP solver) on all three networks for detecting vulnerable nodes, which is acceptable compared with the centrality algorithms (1-2 seconds), and over 100 times faster than obtaining the optimal solution even with the sparse metric. Besides, US Network Assets, as a well-designed communication network in practice, is shown to be the most robust among these three topologies, even with its lower density.

Figure 4-10. The performance evaluation of TRGA against degree and betweenness centrality algorithms for the CCND problem

4.3  Related  Works 

Many  existing  works  on  network  vulnerability  assessment  mainly  focus  on  the 
centrality  measurements  [17],  including  degree,  betweenness  and  closeness  centralities, 
average  shortest  path  length  [3],  global  clustering  coefficients  [60]. 

Because the above measurements fail to assess network vulnerability adequately,
Sun et al. [80] first proposed the total pairwise connectivity as an effective measurement
and empirically evaluated the vulnerability of wireless multihop networks using this
metric. Arulselvan et al. [9] showed the difficulty of the CND problem by proving its
NP-completeness. Later on, the β-disruptor problem was defined by Dinh et al. [27]
to find a minimum set of links or nodes whose removal degrades the total pairwise
connectivity to a desired degree. They proved the NP-completeness of this problem
with respect to both links and nodes, together with the corresponding inapproximability
results. Even for tree topologies, Di Summa et al. [61] found that the discovery of critical
nodes remains NP-complete using this metric. In this paper, we further investigate the
theoretical hardness of both CLD and CND on UDGs and PLGs.
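
The total pairwise connectivity metric discussed above can be illustrated with a minimal sketch. It assumes the usual reading of the metric, namely the number of node pairs that remain connected by a path after a set of nodes is removed, and an undirected graph stored as an adjacency dictionary. The function name `pairwise_connectivity` and its parameters are illustrative and do not come from the cited works.

```python
from collections import deque


def pairwise_connectivity(adj, removed=()):
    """Count node pairs still connected in the undirected graph `adj`
    after deleting the nodes in `removed` (illustrative helper)."""
    removed = set(removed)
    seen = set()
    total = 0
    for start in adj:
        if start in removed or start in seen:
            continue
        # Breadth-first search to measure the component containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    queue.append(v)
        # A component of `size` nodes contributes size-choose-2 connected pairs.
        total += size * (size - 1) // 2
    return total
```

On a five-node path a-b-c-d-e, for example, removing the middle node c leaves two components of two nodes each, so the metric drops from 10 to 2, whereas removing the end node a only lowers it to 6.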

In addition, there are a few effective solutions in the literature on network
vulnerability assessment based on the pairwise connectivity. Arulselvan et al. [9]
designed a heuristic (CNLS) to detect critical nodes, which is, however, still far from
the optimal solution in large-scale and dense networks. In [27], Dinh et al. proposed
pseudo-approximation algorithms to solve the β-disruptor problem. However, this
problem is defined differently from ours, and its solution is hard to use when we only know
the available cost to destroy or protect these critical links or nodes.

When failures cascade, these results are no longer valid, since the
vulnerability of networks can be substantially different. Most of the works regarding
cascading failures focus mainly on models [25, 45, 85]. Moreover, some other
papers provide experimental analysis [29, 67]. Unfortunately, theoretical works,
which are crucial to network design and proactive protection, are lacking.
Therefore, we provide a probabilistic analysis to assess the vulnerability of complex
networks in the case of cascading failures, leading to deep insights into the robustness of
various networks under random failures.

In addition, most works on network vulnerability assessment under adversarial
attacks do not take cascading failures into account. Besides the
widely-used centrality measurements [3, 17, 60], Arulselvan et al. [9] first proposed the
total pairwise connectivity as an effective measurement, based on which they proposed
the CND problem and designed a heuristic to detect critical nodes. The β-disruptor
problem was later defined by Dinh et al. [27], followed by pseudo-approximation
algorithms. Unfortunately, these approaches fail to accurately identify the vulnerable
nodes in the presence of cascading failures. In this paper, we further investigate the
theoretical hardness of the CVND problem, along with an effective algorithm.

Algorithm 9: TRGA Algorithm

Input : Network G, Threshold θ
Output: The set of k vulnerable nodes S

k' ← k;
S ← ∅;
while |S| < k do
    k' ← k − |S|;
    D ← k' largest degree nodes in G[V \ S];
    φ ← #failed nodes after cascading failures by removing D from G[V \ S];
    U ← ∅;

    // Ultimate Failure Nodes Identification
    while Pairwise Connectivity > φ do
        Use Constraint Pruning in [77] to solve the LP formulation with Q3;
        φ ← disconnected node-pairs after removing u;
        u ← the node with largest v_u;
        U ← U ∪ {u};
        G ← G[V \ {u}];
    end

    // Vulnerable Nodes Tracing Back
    Q ← ∅;  // Priority Queue
    S' ← ∅;
    while there exists a node that does not fail do
        if Q = ∅ then
            foreach node u do
                Calculate the cascading influence after removing u from G;
            end
            Construct Q based on cascading influence of each node;
        end
        else
            S' ← S' ∪ the node in Q with max priority;
            Update cascading influence caused by removing this node;
        end
    end
    if |S'| > k' then
        S ← S ∪ k' largest degree nodes in G[V \ S];
    end
    else
        S ← S ∪ S';
    end
end

// Local Search
S* ← S;
foreach node u ∈ S do
    Swapping(u);  (Algorithm 2 in [77] by replacing f(G, S') with the pairwise
                   connectivity function of residual graph G after removing S');
end
S ← S*;
return S;


CHAPTER  5 
CONCLUSION 

In this dissertation, we first analyzed the approximation hardness and inapproximability
of optimal substructure problems on power-law graphs. These problems are shown to be
inapproximable within certain constant factors on both general and simple power-law
graphs; that is, they remain APX-hard. On the other hand, we also showed that Max Clique
and Graph Coloring are still very hard to approximate, since the optimal solutions to
these problems depend on the structure of local graph components rather than on the
global graph. In other words, the power-law degree distribution does not help much for
optimization problems that lack the optimal substructure property. Moreover, we proposed
an algorithmic framework, along with a theoretical framework for analyzing approximation
ratios, based on the idea of percolating the power-law graph from the nodes of lowest
degree to the other nodes.

In addition, we studied the robustness of power-law networks under various threats,
i.e., random failures, preferential attacks and degree-centrality attacks. Essentially,
power-law networks are shown to be extremely tolerant of random failures.
Meanwhile, they are more robust under both preferential attacks and degree-centrality
attacks if they have a smaller exponential factor β. When failures can cascade, we
showed that power-law networks are extremely vulnerable even with very small β.

In order to provide an optimal design of power-law networks, we further explored the
topologies of practical real-world networks by optimizing the costs and guaranteeing
their robustness. The best range of the exponential factor β is shown to be [1.8, 2.5],
which gives a reasonable explanation for the topologies of most real-world networks.
When β < 1.8, the network maintenance cost is very high, and when β > 2.5, the
network robustness is unpredictable since it depends on the specific attacking strategy.
Also, we studied the CLD and CND optimization problems to identify critical links and
nodes in a network whose removal maximally destroys the network's functions. We proved
their NP-hardness and proposed HILPR, a novel LP-based rounding algorithm, for solving
the CLD and CND problems in a timely manner. In the presence of cascading failures,
we further studied the CCND problem and developed the effective iterative 2-phase TRGA
algorithm. The experiments on various synthetic and real-world networks illustrated the
good performance of our proposed approaches.

CASCADING PROPAGATION AND OPTIMIZATION IN NETWORKS

By
Major: Computer Engineering

Cascading processes are increasingly common in highly connected networks.
These processes are recognized in a wide range of networks with different contexts:
information diffusion in online social networks, cascading crises in networks
of banks, cascading failures in power networks, etc. Regardless of the network
type and mechanism, they share fundamental properties: (1) the root cause is the
influences/dependencies between nodes of one or more networks, (2) the process often
starts from a small group of nodes, and (3) the impact is high due to the large number of
involved nodes. It is thus crucial to study these processes and exploit them efficiently.

In this work, we study several optimization problems relating to the cascading
process in networks. In particular, we mainly focus on two kinds of problems: (1) finding
a small set of nodes which maximizes the impact through the cascading process
and (2) finding a set of nodes of minimum size which causes the desired impact.
Depending on the cascading mechanism, we design different strategies to solve the
problem efficiently by exploiting the properties of both the cascading mechanism
and the network structure.

CHAPTER 1
INTRODUCTION

Nowadays, the world is more and more connected, and networks are everywhere.
Networks, whether of power substations in power networks, routers in communication
networks, or users in online social networks, represent the interactions between entities
and play crucial roles in the economy. Power networks are important infrastructure
networks whose malfunction can disrupt or halt almost all daily activities.
On the other hand, large-scale online social networks ease communication
between people by providing a platform for users to connect and keep up to date with
each other. Due to the high impact of these networks on the economy, it is crucial to
study phenomena which significantly affect activities in networks.

As entities in networks interact with each other, cascading propagation is one of
the most noticeable phenomena in networks. If an event happens at a particular entity,
it can trigger events at other entities through the connections between entities in the
network. For instance, the interaction between users in online social networks serves
as the medium to spread information, ideas, and influence. Initially, only a small group
of users are aware of the information and share it in the networks; then, the information
spreads to their friends, friends of their friends, and so on. As a result, a large number of
users will be aware of the information even before the mass media broadcast it, as in the
case of Michael Jackson's death [1]. In the power network, cascading propagation
can cause severe damage by multiplying the initial failure. In 2003, the initial failure
of one power line triggered a series of failures which resulted in the outage of the
majority of Italy [47]. The large-scale effect of cascading propagation inspires us
to design methods to exploit its positive effects and prevent its negative effects. However,
cascading mechanisms vary widely across networks, so we focus on
investigating cascading propagation in the following settings:

1.1 Cascading Failure in a Network

Networks in which the operation of a node strongly depends on the operation of other
nodes, such as power networks, are extremely vulnerable to cascading failure. In these
networks, every node bears a dynamic operational load depending on the demand. If the
demand is high, the nodes' loads are high and may reach their maximum operational
capacities. In general, nodes share the network demand so that each of them can
operate under its permitted capacity. However, when some nodes malfunction or
fail, they shift their load to nearby nodes in the network. These nodes may be forced
to work beyond their capacities, so they become overloaded and redistribute their load
onto other nodes. As a consequence, a large number of nodes may be overloaded, and the
network thereby halts operation entirely.

As the failure of a small group of nodes may result in catastrophic damage to the
network operation, it is important to identify such groups. These nodes are critical to the
operation of the network, thus we need to protect them from being attacked. Although
the existing literature provides various vulnerability assessments of networks under
cascading failure, efficient solutions for this problem are still lacking. These works
mainly exploit centrality measurements to locate the most critical nodes, which is not
enough to capture the complicated interaction of nodes. We design new methods which
consider both the network structure and the interaction between nodes to provide better
solutions.

1.2  Cascading  Failure  in  Interdependent  Networks 

In reality, infrastructure networks are interdependent to a large
degree. The power stations in the power grid consume fuel delivered by the
transportation network to generate electricity and are controlled via the communication
network. If the transportation network or the communication network encounters any
problems, the operation of the power network will suffer. On the other hand, the power
network provides electricity for the routers of the communication network and for electrical
vehicles. Therefore, the failure of a critical group of nodes in any network may result in
a series of cascading failures across the system and cause catastrophic loss. If each
network is treated separately, we will underestimate the vulnerability of the networks.

We need to revisit the vulnerability assessment of networks, taking into account
the effect of interdependencies between networks. Let us consider the power and
communication networks. From the attacker's point of view, an attacker can analyze the
interdependencies and identify nodes whose failures trigger a large cascading
process back and forth between the networks. If we only investigate a single network,
these nodes seem to be non-critical and we fail to protect them. Although there are many
efficient methods to identify critical nodes in a single network, such methods are still
lacking for interdependent networks. In this work, we propose a new centrality for
interdependent networks which can be used to locate critical nodes efficiently.

1.3  Influence  Diffusion  in  Multiple  Online  Social  Networks 

In the area of online social networks (OSNs), the cascading propagation of
information transforms networks such as Facebook, Google+, and Twitter into a fruitful
foundation for viral marketing. These networks equip users with tools to connect and make
new friends, to share opinions, to get updates from friends, etc., and thus attract a
considerable fraction of the population. In return, users create content and
circulate information at a level that has never been achieved before by any previous
communication medium. In addition, users in online social networks also experience the
same peer-pressure effect as in reality, i.e., an individual's opinion or decision
is influenced by his friends and colleagues. These factors raise a practically important
problem in OSNs: how to find the smallest set of influencers who can influence a
massive number of users.

A noticeable property of OSNs is the overlap among major OSNs, which has a
strong impact on the diffusion of information. Since a user can share information in
all networks in which he participates, the influence of a user across all networks is
significantly larger than in any single network. Therefore, it is essential to evaluate the
influence of users in multiple OSNs to identify the most influential ones. However, we
cannot trivially adapt evaluation methods for a single network to multiple networks. To
overcome this difficulty, we propose novel schemes to couple multiple networks into one
network that preserves all diffusion information, and solve the problem in the coupled network.

1.4  Organization 

The rest of the work is organized as follows. Chapter 2 studies the vulnerability
of power networks under the load redistribution model. In Chapter 3, we present the
cascading failure model for interdependent networks and propose algorithms to detect
critical nodes. Next, Chapter 4 investigates information diffusion in multiple online social
networks. Finally, Chapter 5 concludes the thesis.

CHAPTER 2

CASCADING FAILURE UNDER LOAD REDISTRIBUTION IN NETWORKS

The important role of power networks in the economy as well as in society
has attracted a great deal of research effort to analyze the vulnerability of these
networks. The failure or malfunction of these networks can have severe effects. On
28 September 2003 [47], a wide-area blackout affected the majority of Italy, leaving
3/4 of Italy without electricity for 2 hours and halting the traffic system. This shows that
large blackouts are not rare and can happen anywhere. Moreover, it implies that
intentional attacks can cause massive damage to power networks. When a small number
of components are attacked, a large blackout can happen in a very short time. Thus,
it is crucial to identify the most vulnerable components of the power network so that we
can protect them in advance.

The common denominator of large blackouts is that the failures of components
happen in a cascading manner. It often starts with the failure of one or a
few components; then some other components fail due to their dependencies on the
previously failed components. The failure of these components in turn causes other
components to fail. The process continues until no more components fail. The
power stations, which are the nodes in the power network, can only work well if the load is
under the maximum capacity they can handle. When a station is overloaded, it cannot
work at its best performance or even fails. During normal operation, the power network is
designed such that all stations work under their capacity. But when some stations fail,
other stations which are directly or indirectly connected with the failed ones may have to
bear more load. If the load of a station surpasses its capacity, it will fail and in turn shed
its load to other stations. As a result of the load redistribution process, a large number of
stations may have failed at the end. We would like to predict the process so that we
can prevent it, but the dependencies between stations make it difficult to do so. Thus, it
is necessary to develop efficient tools to analyze the sophisticated cascading process of
failures in power networks.

In this chapter, we study the critical node detection problem in power networks
under the load redistribution of nodes. Specifically, we aim to find the set of k nodes
whose failures maximize the number of failed nodes after the cascading failure. We
design two efficient algorithms to solve the problem: the adaptive cascading potential
algorithm and the cooperating attack algorithm. The efficiency of each algorithm depends
on the topological structure of the network, hence they complement each other in solving
the problem.

The  rest  of  the  chapter  is  organized  as  follows.  We  first  present  the  load  redistribution 
model  and  problem  formulation  in  Section  2.1.  Section  2.2  shows  the  hardness  result. 
After  that,  we  propose  the  cascading  potential  metric  and  design  various  algorithms  in 
Section  2.3.  We  next  introduce  the  cooperating  algorithm  which  is  efficient  on  robust 
networks  in  Section  2.4.  Section  2.5  shows  the  experimental  evaluation.  Finally,  we 
review the literature in Section 2.6 and summarize the chapter in Section 2.7.

2.1  Network  Model  and  Problem  Formulation 

2.1.1  Graph  Notations 

The network is modeled by a weighted directed graph G = (V, E) with a vertex
set V of |V| = n vertices and an edge set E of |E| = m oriented connections between
vertices. Each edge (u, v) is associated with a weight w(u, v) representing the operating
parameter of the network. The higher w(u, v) is, the more load is distributed from u to v.
In addition, each vertex u has a current load L(u) and a capacity C(u). The capacity
C(u) is the maximum load that vertex u can accept. We denote the sets of incoming
and outgoing neighbors of u by $N^-(u)$ and $N^+(u)$, respectively.

2.1.2  Cascading  Failure  Model 

In the Load Redistribution model (LR-model) [56] [53], nodes fail in a
cascading manner due to the load redistribution of failed nodes. Initially, a set of nodes
S fails; then the failures propagate to other nodes in time steps. When node
u fails, its load is redistributed to its neighbors as illustrated in Fig. 2-1. Each alive
neighbor receives an additional load proportional to the weight of its edge from u.
Precisely, each neighbor v of u receives the additional load

$$\Delta L(v) = L(u) \times \frac{w(u, v)}{\sum_{z \in N^+(u)} w(u, z)}.$$

Due to the load redistribution, the loads of some nodes exceed their
capacities, and these nodes fail in the next time step. The process of load redistribution
and node failure stops when no more nodes fail. The set of failed nodes
caused by the initial failure of S is denoted by F(S).
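
The redistribution rule can be made concrete with a minimal simulation sketch. It rests on several assumptions that the text does not fix: failures are processed in synchronous time steps, a node fails only when its load strictly exceeds its capacity, the share of each out-neighbor is normalized over all of $N^+(u)$ as in the formula, and any share addressed to an already failed neighbor is discarded. The function name `simulate_lr_cascade` and the dictionary-based graph representation are illustrative choices.

```python
from collections import defaultdict


def simulate_lr_cascade(out_edges, load, capacity, seeds):
    """Return F(S), the set of failed nodes, under a simplified LR-model.

    out_edges: dict node -> dict {out-neighbor: weight w(u, v)}
    load, capacity: dict node -> float
    seeds: iterable of initially failed nodes S
    """
    load = dict(load)          # work on a copy so the caller's data is untouched
    failed = set(seeds)
    frontier = set(seeds)      # nodes that failed in the current time step
    while frontier:
        extra = defaultdict(float)
        for u in frontier:
            nbrs = out_edges.get(u, {})
            total = sum(nbrs.values())      # sum of w(u, z) over z in N+(u)
            if total == 0:
                continue
            for v, w in nbrs.items():
                if v not in failed:         # assumption: only alive nodes receive load
                    extra[v] += load[u] * w / total
        for v, delta in extra.items():
            load[v] += delta
        # Assumption: a node fails when its load strictly exceeds its capacity.
        frontier = {v for v in extra if load[v] > capacity[v]}
        failed |= frontier
    return failed
```

With a chain resembling Fig. 2-1 (u sends most of its load to v, v feeds z, and z feeds w), a single initial failure at u can take down v, z and w in turn while a lightly weighted neighbor of u survives.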


Figure 2-1. When node u fails, its load is redistributed to the neighbor nodes. Among
these nodes, v receives a high portion of the load from u and becomes
overloaded. The load of v is redistributed to its neighbors, which makes z fail.
Finally, the load from z causes w to fail and the process stops.

2.1.3 Problem Definition

Due to cascading failures, the failure of a small set of nodes S can result in
a catastrophic number of failed nodes. These nodes become the targets for attacking the
network. Additionally, given the same set of attacked nodes, different attacking orders
lead to different outcomes. With the same attacking cost, the attacker can choose the
best order with a suitable time for each attacked node. However, cascading failures
happen very fast, so it is almost impossible to schedule the failure of each node at
specific time steps. We consider a more practical strategy in which target nodes are
attacked one by one: the next node is taken down when the cascading process stops.
In particular, given an ordered set $S = \{s_1, s_2, \ldots, s_k\}$, the set of failed nodes
after $s_i$ is attacked is $F_i(S) = F(F_{i-1}(S) \cup \{s_i\})$, with $F_0(S) = \emptyset$. Denote by
$F^+(S) = F_k(S)$ the set of failed nodes when the nodes in S are attacked serially. We
formally define the problem as follows.

Definition 1 (Cascading Critical Node Problem (CasCN)). Given a network G = (V, E)
and an integer k, the problem asks to find an ordered subset $S \subset V$ of size $|S| = k$
such that the serial failure of the nodes in S maximizes the number of failed nodes
$|F^+(S)|$ under the LR-model.
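
The recursion $F_i(S) = F(F_{i-1}(S) \cup \{s_i\})$ can be sketched as follows. The sketch treats the cascade function F as a black box passed in as the parameter `cascade`; the names `serial_failures`, `toy_cascade` and `TOY_DEPENDENTS` are hypothetical, and the toy cascade (failures spreading along fixed dependency arcs) is only a stand-in for the LR-model, not part of it.

```python
def serial_failures(cascade, order):
    """Compute F+(S) = F_k(S) for an ordered attack S = (s_1, ..., s_k).

    cascade: function mapping a set of initially failed nodes to F(.)
    order:   sequence of attacked nodes, attacked one at a time
    """
    failed = frozenset()                      # F_0(S) is the empty set
    for s in order:
        # F_i(S) = F(F_{i-1}(S) united with {s_i})
        failed = frozenset(cascade(failed | {s}))
    return failed


# Hypothetical toy cascade: a failure spreads along fixed dependency arcs.
TOY_DEPENDENTS = {"a": {"b"}, "b": {"c"}, "d": set()}


def toy_cascade(seeds):
    failed, stack = set(seeds), list(seeds)
    while stack:
        u = stack.pop()
        for v in TOY_DEPENDENTS.get(u, ()):
            if v not in failed:
                failed.add(v)
                stack.append(v)
    return failed
```

Any cascade model that maps a seed set to its final failed set can be plugged in for `cascade`, so the same loop can evaluate candidate attack orders for CasCN.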

2.2  Inapproximability  Result 

In this section, we show the algorithmic hardness of the CasCN problem. Ideally, we
would design an algorithm that identifies the optimal seed set in an acceptable
time. However, it may take time exponential in the number of nodes to
compute even a set whose impact is close to that of the optimal set. The hardness result
is shown in Theorem 2.1.

Theorem 2.1. It is NP-hard to approximate the CasCN problem within a ratio of
$O(n^{1-\epsilon})$ for any constant $1 > \epsilon > 0$.

Proof. We use a gap-introducing reduction [51] to prove the inapproximability of
the CasCN problem. Using a polynomial-time reduction from Set Cover to the CasCN
problem, we show that if there exists a polynomial-time algorithm that approximates the
latter problem within $O(n^{1-\epsilon})$, then there exists a polynomial-time algorithm to
solve the former problem.

Definition 2 (Set Cover problem). Given a universe $U = \{e_1, e_2, \ldots, e_n\}$, a collection
of subsets $\mathcal{S} = \{S_1, S_2, \ldots, S_m\} \subseteq 2^U$, and an integer k, the Set Cover
problem asks whether or not there are k subsets whose union is U.

Instead of using the hardness result of the general Set Cover problem, we use the
result on a restricted variant MIN3SC2 of the Set Cover problem where the sizes of
subsets are at most 3 and each element appears in exactly two subsets.

Theorem 2.2 ([20]). The Set Cover problem is NP-hard even when the sizes of subsets
are bounded by 3 and each element appears in exactly two subsets.

Reduction. Given an instance $\mathcal{I} = (U, \mathcal{S}, k)$ of the Set Cover problem where each
element appears in exactly two subsets, with $n_1 = |U|$ and $m_1 = |\mathcal{S}|$, we construct an
instance $\mathcal{I}'$ of the CasCN problem as illustrated in Fig. 2-2.

Figure 2-2. Reduction from MIN3SC2 to CasCN

The vertex set V. Add a set vertex $u_i$ for each set $S_i \in \mathcal{S}$, an element vertex $v_j$ for
each element $e_j \in U$, and $d = (n_1 + m_1)^{?}$ extra vertices $q_1, q_2, \ldots, q_d$.

The edge set E. Add the edge $(u_i, v_j)$ if the element $e_j$ is in the set $S_i$. In addition,
there is an edge from $v_j$ to $q_p$ for all $1 \le j \le n_1$, $1 \le p \le d$. All edges have weight 1.

Vertex load and capacity. The load and capacity of the set vertex $u_i$ are $L(u_i) = |S_i|$
and $C(u_i) = 1 + L(u_i)$. The load and capacity of the element vertex $v_j$ are $L(v_j) = d - 1$
and $C(v_j) = d - 0.5$. All extra vertices have load 0 and capacity $n_1 - 1 + n_1/d$.

Next, we prove that if $\mathcal{I}$ has a set cover of size k, then there exists a seeding set
$A \subset V$ such that $|F(A)| \ge d$. Otherwise, for all $A \subset V$ with $|A| \le k$,
$|F(A)| < n_1 + m_1$.

Assume that $\mathcal{I}$ has a set cover SC of size k. We select the set $A = \{u_i \mid S_i \in SC\}$
as the seeding set. Initially, each vertex $u_i \in A$ redistributes 1 unit of load to each of
its $|S_i|$ neighbors. Since each element is covered by at least one set in SC, each vertex
$v_j$ receives an additional load of at least 1. In the next round, $v_j$ has load
$L(v_j) \ge (d - 1) + 1 = d$, which is higher than its capacity, so it fails. When $v_j$ fails, it
equally redistributes $L(v_j)/d \ge 1$ load to each of its d extra neighbor vertices. The load
of each extra vertex $q_p$ after receiving load from all failed element vertices is
$L(q_p) \ge n_1 > C(q_p)$, hence all extra vertices fail. The cascading process stops with
$|F(A)| = d + n_1 + k$ failed vertices.

In the case that $\mathcal{I}$ has no set cover of size k, we show that the optimal seeding set
can cause at most $n_1 + k$ nodes to fail in $\mathcal{I}'$. Let A be an arbitrary optimal seeding
set. We observe that a set vertex only fails when it is selected in the seeding set, since it
has no incoming edge. Thus, there are at least $m_1 - k$ set vertices which are not in A. We
can replace any extra vertex $q_p \in A$, if there is any, by an unselected set vertex without
decreasing the number of failed nodes. Next, suppose that there exists an element
vertex $v_j \in A$; we can also replace it by a set vertex. If $v_j$ is adjacent to some vertex
$u_i \in A$, we can remove $v_j$ from A while maintaining the same number of failed nodes.
If $v_j$ is not adjacent to any vertex in A, we replace $v_j$ by one of its neighboring
set vertices $u_i$; $u_i$ will make $v_j$ fail, so the number of failed nodes caused by A does
not decrease. So, we can replace the extra and element vertices in A such that A contains
only set vertices. Since there is no set cover of size k, at least one element vertex is not
adjacent to any vertex in A, i.e., the number of failed element vertices is at most $n_1 - 1$.
Each failed element vertex $v_j$ is adjacent to at most 2 set vertices, hence its load is at
most $L(v_j) \le d + 1$. Each extra vertex $q_p$ receives at most $(d + 1)/d$ redistributed load
from each failed element vertex, which accumulates to at most

$$\frac{(n_1 - 1)(d + 1)}{d} = n_1 - 1 + \frac{n_1 - 1}{d} < n_1 - 1 + \frac{n_1}{d} = C(q_p).$$

Thus, no extra vertex fails. The total number of failed nodes caused by A is at most
$n_1 + k < n_1 + m_1$.

Now suppose that we have a polynomial-time algorithm $\mathcal{A}$ which approximates the
CasCN problem within $n^{1-\epsilon}$; we can then decide the Set Cover problem as follows. For
any instance $\mathcal{I}$ of the Set Cover problem, we construct the instance $\mathcal{I}'$ as above in
polynomial time, as d is a polynomial function of $n_1$ and $m_1$. Now, if $\mathcal{I}$ has a set cover
of size k, the optimal seeding set $A_{opt}$ causes at least d vertices to fail in $\mathcal{I}'$. The
algorithm $\mathcal{A}$ approximates the optimal solution within $n^{1-\epsilon}$ (where
$n = n_1 + m_1 + d$ is the number of vertices in $\mathcal{I}'$), so it finds a seeding set
$\mathcal{A}(\mathcal{I}')$ which causes at least $(m_1 + n_1)$ vertices to fail:

$$|F(\mathcal{A}(\mathcal{I}'))| \ge \frac{|F(A_{opt})|}{n^{1-\epsilon}} \ge \frac{d}{n^{1-\epsilon}} \ge m_1 + n_1.$$

On the other hand, if $\mathcal{I}$ has no set cover of size k, then the optimal seeding set $A_{opt}$
of $\mathcal{I}'$ causes fewer than $(m_1 + n_1)$ vertices to fail. We have:

$$|F(\mathcal{A}(\mathcal{I}'))| \le |F(A_{opt})| < m_1 + n_1.$$

It implies that $\mathcal{I}$ has a set cover of size k if and only if
$|F(\mathcal{A}(\mathcal{I}'))| \ge m_1 + n_1$. Hence, we can use $\mathcal{A}$ to decide the Set Cover
problem in polynomial time, i.e., P = NP. □

2.3  Cascading  Potential  and  Derived  Algorithms 

In  this  section,  we  introduce  a  new  metric  to  measure  the  node  importance  under 
the  cascading  failure,  and  then  apply  it  to  design  efficient  algorithms  for  CasCN.  To 
evaluate  the  vulnerability  of  networks  under  the  load  redistribution,  previous  works  in 
the  literature  propose  various  ranking  methods  and  measure  the  effect  of  attacking 
the top k nodes. However, these methods consider very limited topological information and hence may miss the most critical nodes. In [7, 36, 53], the authors use the load as the sole criterion to rank nodes. Intuitively, the failure of a high-load node tends to cause a large number of nodes to fail, as it redistributes a large amount of load to its neighbors; however, cascading failures started from a low-load node at the right position may result in a larger number of failed nodes [54]. Wang et al. [54] overcome this shortcoming by directly assessing the effect of the cascading process, i.e., the number of failed nodes triggered by the evaluated node. Although this direct impact is one of the top factors, they fail to incorporate the indirect impact into the node importance. When multiple nodes are attacked in the network, the indirect impact of a node is the base for the direct impact of other nodes. Next, we introduce a new metric which considers both the direct and indirect impact of a node.

2.3.1  Cascading  Potential 

The cascading potential of a node is defined as the combination of all possible impacts that the node causes in the network under the cascading effect. Consider the failure of node u. For any other node v, there are two possible impacts that u can induce on v:

•  Failure  impact.  The  failure  of  u  leads  to  the  failure  of  v. 

•  Load  impact.  The  failure  of  u  makes  the  load  of  v  increase  but  not  enough  to  fail. 

The overall failure impact and load impact of u in the network are defined as the number of failed nodes and the total increased load of unfailed nodes, respectively. The cascading potential of u is the linear combination of these two factors:

$$\mathcal{C}(u) = \frac{|F(\{u\})|}{n} + \frac{\sum_{v \in V \setminus F(\{u\})} \Delta L_u(v)}{\sum_{v \in V \setminus F(\{u\})} (C(v) - L(v))}$$

where $F(\{u\})$ is the set of failed nodes when u fails and $\Delta L_u(v)$ is the additional load that v receives due to the failure of u. Here $\mathcal{C}(u)$ denotes the cascading potential of u, which is distinct from the capacity $C(v)$.

In this formula, we normalize both the failure and load impacts so that they are unit-free and comparable. The failure impact is divided by the number of nodes, hence it is at most 1, attained when all other nodes fail. Similarly, the load impact is divided by the total capacity-load difference of unfailed nodes and achieves its maximum value of 1 when all remaining nodes are at the edge of failure, i.e., the most vulnerable state of the network.
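The definition above can be illustrated with a minimal sketch. The redistribution rule is not restated in this section, so the code assumes (hypothetically) that a failed node splits its current load equally among its live neighbors and that a node fails once its load strictly exceeds its capacity. The names `simulate_cascade` and `cascading_potential` are illustrative and do not come from the text.

```python
from collections import deque


def simulate_cascade(adj, load, cap, seeds):
    """Fail `seeds`, propagate failures, return (failed set, final loads)."""
    cur = dict(load)
    failed = set(seeds)
    queue = deque(dict.fromkeys(seeds))  # keep seed order, drop duplicates
    while queue:
        u = queue.popleft()
        alive = [v for v in adj[u] if v not in failed]
        if not alive:
            continue  # no live neighbor can take over the load
        # ASSUMPTION: equal split of the failed node's load among live neighbors.
        share = cur[u] / len(alive)
        for v in alive:
            cur[v] += share
            if cur[v] > cap[v]:  # ASSUMPTION: failure when load exceeds capacity
                failed.add(v)
                queue.append(v)
    return failed, cur


def cascading_potential(adj, load, cap, u):
    """Normalized failure impact plus normalized load impact of attacking u."""
    failed, cur = simulate_cascade(adj, load, cap, [u])
    survivors = [v for v in adj if v not in failed]
    failure_impact = len(failed) / len(adj)
    extra = sum(cur[v] - load[v] for v in survivors)   # total added load
    slack = sum(cap[v] - load[v] for v in survivors)   # total capacity-load gap
    load_impact = extra / slack if slack > 0 else 0.0
    return failure_impact + load_impact


if __name__ == "__main__":
    adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    load = {"a": 1.0, "b": 1.0, "c": 1.0}
    cap = {"a": 1.5, "b": 1.5, "c": 3.0}
    print(cascading_potential(adj, load, cap, "a"))
```

In the toy path above, attacking `a` makes `b` fail and pushes `c` exactly to its capacity, so the failure impact is 2/3 and the load impact reaches its maximum of 1.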

The role of the load impact. In the formulation of the cascading potential, the load impact plays an important role in providing a better assessment of the network vulnerability compared to the metric in [54]. If only one node is attacked, the obvious choice is the node which maximizes the number of failed nodes, i.e., to use Wang et al.’s metric. However, when multiple nodes are attacked, we need to consider the co-impact of the attacked nodes to trigger a large cascading failure. The load impact is the bridge connecting the impacts of these nodes, since the load impact of a node is the base for the failure impact of other nodes. For example, if u has the maximum load impact of 1 and the network is strongly connected, then attacking any node after u can take down the whole network. Thus, the cascading potential evaluates the importance of nodes more comprehensively.

2.3.2  Cascading  Potential  Algorithm 

Intuitively,  we  can  use  cascading  potential  directly  to  design  an  algorithm  for 
CasCN.  We  first  compute  the  cascading  potential  of  all  nodes,  then  select  top  k  as 
attacked  nodes.  The  algorithm  is  described  in  Algorithm  1. 

Algorithm 1 Cascading Potential Algorithm
Require: A network G = (V, E), an integer k.
Ensure: A set S of k attacked nodes.
    Compute the cascading potential of all nodes
    Sort nodes in non-increasing order of the cascading potential: $\mathcal{C}(u_1) \ge \mathcal{C}(u_2) \ge \dots \ge \mathcal{C}(u_n)$
    Initialize $S \leftarrow \emptyset$
    for i = 1 to k do
        $S \leftarrow S \cup \{u_i\}$
    end for
    Return S


Time complexity. It takes at most O(m) time to compute the cascading potential of each node. Thus, the total running time is O(nm + n log n).

2.3.3  Adaptive  Cascading  Potential  Algorithm 

The Cascading Potential algorithm runs fast, but it neglects an important property of the cascading failure: the overlapping impact of selected nodes. Consider two nodes u and v which both have failure impact on node z. If u is selected before v, then v has no impact on z, as z has already failed. As a consequence, some nodes with high impact initially will have small impact late in the selection process. We can improve the performance of the algorithm by updating the impact of the remaining nodes on the fly. More specifically, at the i-th iteration, the impact (failure or load impact) of node u on failed nodes (due to the selection of the first i - 1 attacked nodes) is subtracted from the initial impact of u. After that, the node with the highest remaining impact is selected.

The crucial problem is how to update the impact of nodes efficiently. Naively, we may keep the list of impacted nodes for each node u and, at each iteration, compare this list with the set of failed nodes to subtract the impact on failed nodes. This can result in $O(n^3)$ running time for each iteration, which is very time consuming. We reduce the updating time by reversing the process. Each node v keeps two lists of nodes: the list FI[v] contains the nodes which have failure impact on v, and the list LI[v] contains the nodes which have load impact on v. Since the load impacts of other nodes on v are different, we use LI[v][u] to store the load impact of u on v after the normalization. When v fails, the impact of the nodes in its lists is updated. The crucial point is that each node fails only once, thus the running time is reduced significantly. The algorithm with adaptive cascading potential is described in Algorithm 2.

Algorithm 2 Adaptive Cascading Potential Algorithm
Require: A network G = (V, E), an integer k.
Ensure: A set S of k attacked nodes.
    for each $v \in V$ do
        Initialize $FI[v] \leftarrow \emptyset$, $LI[v] \leftarrow \emptyset$
    end for
    for each $u \in V$ do
        Compute $\mathcal{C}(u)$
        for each $v \in F(\{u\})$ do
            $FI[v] \leftarrow FI[v] \cup \{u\}$
        end for
        for each $v$: $\Delta L_u(v) > 0$ and $v \notin F(\{u\})$ do
            $LI[v][u] \leftarrow \frac{\Delta L_u(v)}{\sum_{z \in V \setminus F(\{u\})} (C(z) - L(z))}$
        end for
    end for
    Initialize $S \leftarrow \emptyset$
    for each $u \in V$ do
        $Mark[u] \leftarrow False$
    end for
    for i = 1 to k do
        $u \leftarrow \arg\max_{v \in V \setminus F(S)} \{\mathcal{C}(v)\}$
        $S \leftarrow S \cup \{u\}$
        for each $v \in F(S)$ do
            if $Mark[v] == False$ then
                $Mark[v] \leftarrow True$
                for each $w \in FI[v]$ do
                    $\mathcal{C}(w) \leftarrow \mathcal{C}(w) - 1/|V|$
                end for
                for each $w \in LI[v]$ do
                    $\mathcal{C}(w) \leftarrow \mathcal{C}(w) - LI[v][w]$
                end for
            end if
        end for
    end for
    Return S




Time complexity. Since each node has impact on at most n nodes, the total size of all FI and LI lists is at most $n^2$. The number of updates is bounded by the total size of the FI and LI lists. Therefore, the total running time is $O(nm + n^2 + kn)$.
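The reverse-list bookkeeping can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function name `adaptive_select`, the argument layout, and the `cascade` callback (which returns the failed set F(S) for a list of attacked nodes) are assumptions made for the example. The single-node potentials, failure sets, and normalized load impacts are taken as precomputed inputs.

```python
def adaptive_select(potential, single_fail, load_imp, cascade, k):
    """Pick k nodes, discounting impacts on nodes that have already failed.

    potential   : dict u -> initial cascading potential of u
    single_fail : dict u -> set of nodes that fail when only u is attacked
    load_imp    : dict u -> {v: normalized load impact of u on v}
    cascade     : callable(list of attacked nodes) -> set of failed nodes
    """
    n = len(potential)
    score = dict(potential)
    # Reverse lists: for each v, who has a failure / load impact on v.
    FI = {v: [] for v in potential}
    LI = {v: {} for v in potential}
    for u in potential:
        for v in single_fail.get(u, ()):
            FI[v].append(u)
        for v, x in load_imp.get(u, {}).items():
            LI[v][u] = x

    selected, marked = [], set()
    for _ in range(k):
        failed = cascade(selected) if selected else set()
        candidates = [u for u in score if u not in failed]
        if not candidates:
            break
        u = max(candidates, key=score.get)
        selected.append(u)
        for v in cascade(selected):
            if v in marked:
                continue  # each node is processed once, when it first fails
            marked.add(v)
            for w in FI[v]:
                score[w] -= 1.0 / n      # w no longer gets credit for failing v
            for w, x in LI[v].items():
                score[w] -= x            # nor for loading v
    return selected


if __name__ == "__main__":
    single_fail = {"a": {"a", "c"}, "b": {"b", "c"}, "c": {"c"}, "d": {"d"}}
    potential = {"a": 0.5, "b": 0.5, "c": 0.25, "d": 0.45}
    load_imp = {"d": {"b": 0.2}}
    # Toy cascade model: failures of a seed set are the union of single-node failures.
    union = lambda S: set().union(*(single_fail[u] for u in S))
    print(adaptive_select(potential, single_fail, load_imp, union, 2))
```

In this toy instance a static top-2 ranking would pick `a` and `b`, which both fail the same node `c`; after the discount, the adaptive rule prefers `d` as the second node.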

2.3.4  Fully  Adaptive  Cascading  Potential  Algorithm 

Continuing the line of cascading potential based algorithms, we further improve the solution quality by spending more time calibrating the cascading potential of nodes. After a node u is attacked, the network state changes with new failed nodes and load updates; this may decrease (as discussed in the preceding part) or increase the impact of a node. The failure of u adds load to many nodes and makes them more vulnerable. Although the impact of a remaining node v is reduced by its impact on failed nodes, it can still increase since other nodes now fail more easily. We can fully update the cascading potential of each node as follows. After selecting a new node, we simulate the cascading failure triggered by it and obtain a new graph of the remaining nodes. In this graph, the load of a node is its load when the cascading process stops. We then evaluate the cascading potential of all nodes in the updated graph and select the one with the highest value. We present the algorithm in Algorithm 3.


Algorithm 3 Fully Adaptive Centrality
Require: A network G = (V, E) and an integer k.
Ensure: A set S of k attacked nodes.
    Initialize $S \leftarrow \emptyset$
    for i = 1 to k do
        Compute the cascading potential of all nodes in G
        Select u as the node with the highest cascading potential
        $S \leftarrow S \cup \{u\}$
        Update node loads and remove all failed nodes in G with the failure of u
    end for
    Return S


Time complexity. Selecting each new node requires computing the cascading potential of all nodes, which takes O(nm) time. Thus the total running time is O(kmn). However, the algorithm may run much faster than the worst-case time since the size of the updated graph decreases when a new node is selected.
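A compact sketch of the fully adaptive loop is given below. It again assumes an equal-split redistribution rule and failure when load strictly exceeds capacity, neither of which is restated in this section, and it normalizes the failure impact by the number of nodes remaining in the updated graph. All function names are illustrative.

```python
from collections import deque


def simulate_cascade(adj, load, cap, seed):
    """Fail `seed` and propagate; return (failed set, final loads)."""
    cur, failed, queue = dict(load), {seed}, deque([seed])
    while queue:
        u = queue.popleft()
        alive = [v for v in adj[u] if v not in failed]
        if not alive:
            continue
        share = cur[u] / len(alive)  # ASSUMPTION: equal split among live neighbors
        for v in alive:
            cur[v] += share
            if cur[v] > cap[v]:      # ASSUMPTION: fail when load exceeds capacity
                failed.add(v)
                queue.append(v)
    return failed, cur


def cascading_potential(adj, load, cap, u):
    failed, cur = simulate_cascade(adj, load, cap, u)
    survivors = [v for v in adj if v not in failed]
    extra = sum(cur[v] - load[v] for v in survivors)
    slack = sum(cap[v] - load[v] for v in survivors)
    return len(failed) / len(adj) + (extra / slack if slack > 0 else 0.0)


def fully_adaptive_select(adj, load, cap, k):
    """Re-evaluate every node on the updated graph before each selection."""
    adj = {u: list(vs) for u, vs in adj.items()}  # work on copies
    load = dict(load)
    selected = []
    for _ in range(k):
        if not adj:
            break  # the whole network has already failed
        u = max(adj, key=lambda x: cascading_potential(adj, load, cap, x))
        selected.append(u)
        failed, cur = simulate_cascade(adj, load, cap, u)
        # Keep only surviving nodes, with their loads after the cascade stops.
        adj = {v: [w for w in nbrs if w not in failed]
               for v, nbrs in adj.items() if v not in failed}
        load = {v: cur[v] for v in adj}
    return selected


if __name__ == "__main__":
    adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    load = {v: 1.0 for v in adj}
    cap = {"a": 5.0, "b": 1.5, "c": 5.0, "d": 5.0}
    print(fully_adaptive_select(adj, load, cap, 2))
```

Recomputing every potential on the shrunken graph is what drives the O(kmn) bound; the sketch makes no attempt to reuse work between iterations.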


2.4  Cooperating  Attack  Algorithm 

The above algorithms connect the impact of multiple nodes through the load impact, which acts as the connecting link even when the network is robust. In this case, where nodes are highly failure tolerant, i.e., the gap between the capacity and the load is big, the failure impact of each node is small. Thus, nodes with high load impacts tend to be selected. If the load of these nodes is scattered over many nodes, the selected nodes are not linked together to make other nodes fail. As a consequence, a large number of nodes have their loads increased, but only a few nodes fail. We inadvertently maximize the total load impact instead of the failure impact, which is the objective function. We need a better strategy which builds a strong connection between selected nodes to increase the number of failed nodes. To fulfill this goal, the new strategy should have the following features:

•  The redistributed load of selected nodes should be concentrated on certain nodes to make them fail. If early selected nodes redistribute load to a set of nodes, then later selected nodes should also redistribute load to this set. We say that the selected nodes cooperate in redistributing load to make more nodes fail.

•  Selected nodes should cooperate to make high-load nodes fail. The failure of high-load nodes can expand the cascading failure further. However, if the preference for high-load nodes reduces the number of failed nodes, the new strategy should avoid blindly favoring the failure of high-load nodes.

We design a new evaluation function of nodes, the efficiency, with properties that tailor the selection process to embrace both desired features. Firstly, we give a higher evaluation to nodes which redistribute their load to load-increased nodes. If the failure of u pushes an additional load $\Delta L_u(v)$ on v, then the impact of u on v is defined by:

$$\gamma(u, v) = \frac{\Delta L_u(v)}{C(v) - L(v)}$$

when $\Delta L_u(v) + L(v) < C(v)$. Since it requires $C(v) - L(v)$ additional load to make v fail, we can interpret this as u making a fraction $\frac{\Delta L_u(v)}{C(v) - L(v)}$ of v fail.

The new impact function implies that the more load a node has received, the more likely a newly selected node is one that redistributes load to it. In other words, the evaluation of u is higher if the loads of its neighbors have increased. This implication is stated in Proposition 2.1.

Proposition 2.1. For any node v at two points of time, if v has more load at the second time point, i.e., $L_2(v) > L_1(v)$, then the impact of another node u with the same redistributed load $\Delta L$ is higher at the second time point: $\gamma_2(u, v) > \gamma_1(u, v)$.

Proof. We have:

$$\gamma_2(u, v) = \frac{\Delta L}{C(v) - L_2(v)} > \frac{\Delta L}{C(v) - L_1(v)} = \gamma_1(u, v)$$

□


Note that Proposition 2.1 assumes that u redistributes the same load on v, i.e., the load of u is the same at the two points of time. In fact, the load of u may increase due to the selection of previous nodes, in which case the evaluation of u increases even more at the second point of time.

To fulfill the second feature, we assign higher values to impacted nodes with high load. The value of a node with load L is denoted by $\sigma(L)$.

The function $\sigma(L)$ is monotonically increasing and takes values in the range $0.5 \le \sigma(L) < 1$. The monotonicity of the function expresses the preference toward high-load nodes. Recall that the main goal is to increase the number of failed nodes, so even nodes with the lowest load have a value of at least half that of the highest-value nodes.

Next, we define the efficiency of selecting u via its impact on v. Intuitively, u makes a $\gamma(u, v)$ fraction of v fail and v has value $\sigma(L(v))$, thus the efficiency of u represented on v is:

$$\lambda(u, v) = \gamma(u, v)\,\sigma(L(v))$$

Finally, we should obviously take into account the number of failed nodes when evaluating node u. The overall efficiency of u is the sum of the number of failed nodes and the efficiency on unfailed nodes:

$$\lambda(u) = |F(\{u\})| + \sum_{v \in V \setminus F(\{u\})} \lambda(u, v)$$
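The overall efficiency can be evaluated with a few lines of code once the cascade triggered by u has been simulated. The sketch below is only illustrative: the closed form of the value function is not recoverable from this text, so the logistic function used here is an assumption chosen because it is monotonically increasing and lies in [0.5, 1) for non-negative loads. The names `sigma` and `efficiency` and the argument layout are hypothetical.

```python
import math


def sigma(load_value):
    """ASSUMED value function: logistic, increasing, in [0.5, 1) for load >= 0."""
    return 1.0 / (1.0 + math.exp(-load_value))


def efficiency(failed, delta, load, cap, value=sigma):
    """Overall efficiency: number of failed nodes plus impact on survivors.

    failed : set of nodes that fail when u is attacked (including u itself)
    delta  : dict v -> additional load pushed onto v by the failure of u
    load   : dict v -> load of v before the attack
    cap    : dict v -> capacity of v
    """
    total = float(len(failed))
    for v, extra in delta.items():
        if v in failed or extra <= 0:
            continue  # failed nodes are already counted with weight 1
        gamma = extra / (cap[v] - load[v])   # fraction of v's slack consumed
        total += gamma * value(load[v])      # weighted by the value of v
    return total


if __name__ == "__main__":
    failed = {"u", "z"}
    delta = {"z": 5.0, "v": 1.0}
    load = {"u": 2.0, "z": 1.0, "v": 0.0}
    cap = {"u": 3.0, "z": 2.0, "v": 4.0}
    print(efficiency(failed, delta, load, cap))
```

Passing a different `value` callable makes it easy to experiment with other monotone value functions without touching the rest of the computation.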

The efficiency evaluation has several notable properties which serve our design goals, as follows.

Increase the number of failed nodes first. If u makes z fail and has efficiency $\lambda(u, v)$ on the unfailed node v, then the contribution of z to the overall efficiency of u is always higher than that of v, since $1 > \gamma(u, v)\,\sigma(L(v))$.

Avoid redistributing load to impossible-to-fail nodes. If node v needs too much additional load before failing, it is effectively ignored in the efficiency evaluation of nodes, as stated in Proposition 2.2.

Proposition 2.2. Given two nodes u and v with fixed load L(v), the efficiency of u on v is monotonically decreasing in the capacity C(v) of v and goes to 0 as C(v) goes to infinity.

Proof. It is easy to see that $\gamma(u, v)$ is monotonically decreasing in C(v) and goes to 0 as C(v) goes to infinity. In addition, $\sigma(L(v))$ is a constant, so the efficiency $\lambda(u, v) = \gamma(u, v)\,\sigma(L(v))$ decreases and goes to 0. □

Not favoring high-load nodes at all costs. We consider the case where the capacity is linear in the load, a common setting in reality to guarantee the safety of nodes. In this case, even if the load of node v is extremely large, v is still ignored, as shown in Proposition 2.3.

Proposition 2.3. Suppose that the capacity C(v) is linear in the load, $C(v) = T \cdot L(v)$ with constant factor T. Then, the efficiency of any node u on v goes to 0 when the load L(v) goes to infinity.

Proof. We have:

$$\gamma(u, v)\,\sigma(L(v)) = \frac{\Delta L_u(v)}{C(v) - L(v)}\,\sigma(L(v)) < \frac{\Delta L_u(v)}{(T - 1)\,L(v)}$$

The right-hand side goes to 0 when L(v) goes to infinity. □


Based on the efficiency evaluation, we propose the Cooperating Attack (CA) algorithm, which works in the same manner as the Fully Adaptive Cascading Potential algorithm. The algorithm also selects nodes one by one. After updating the state of the network, the node with the highest efficiency is selected. The whole algorithm is described in Algorithm 4.


Algorithm 4 Cooperating Attack (CA) Algorithm
Require: A network G = (V, E) and an integer k.
Ensure: A set S of k seed nodes.
    Initialize $S \leftarrow \emptyset$
    for i = 1 to k do
        Evaluate the efficiency of all nodes in G
        Select u as the node with the highest efficiency
        $S \leftarrow S \cup \{u\}$
        Update node loads and remove all failed nodes from G with the failure of u
    end for
    Return S


Time complexity. Similar to the Fully Adaptive Cascading Potential algorithm, the total running time is O(kmn).

2.5  Experimental  Evaluation 

In this section, we present experimental results on both synthesized and real power networks. We first compare the performance of the proposed algorithms with attacking strategies from the current literature [53, 54]. These strategies sort nodes based on some criterion and select the top k nodes as attacked nodes. The sorting criteria are:

•  Highest load (HL).

•  Lowest load (LL).

•  Highest percentage of failure (PoF). The percentage of failure of a node u is the fraction of nodes that fail when u fails.

•  Highest risk if failure (RIF). The RIF of a node u is the ratio between its load and the total load of its neighbor nodes.
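The static baselines reduce to sorting nodes by a per-node score. A minimal sketch follows, with illustrative function names (`rank_top_k`, `risk_if_failure`) that are not from the cited works. PoF is omitted because it needs a cascade simulation; its scores could be passed to `rank_top_k` in the same way.

```python
def rank_top_k(score, k, reverse=True):
    """Return the k nodes with the highest (or lowest) score; ties keep input order."""
    return sorted(score, key=lambda u: score[u], reverse=reverse)[:k]


def risk_if_failure(adj, load):
    """RIF of u: load of u divided by the total load of its neighbors."""
    rif = {}
    for u, nbrs in adj.items():
        nbr_load = sum(load[v] for v in nbrs)
        rif[u] = load[u] / nbr_load if nbr_load > 0 else 0.0
    return rif


if __name__ == "__main__":
    adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    load = {"a": 3.0, "b": 4.0, "c": 3.0}
    print("HL :", rank_top_k(load, 1))                          # highest load
    print("LL :", rank_top_k(load, 1, reverse=False))           # lowest load
    print("RIF:", rank_top_k(risk_if_failure(adj, load), 1))    # highest RIF
```

On this small path, HL picks the middle node while RIF picks an end node, which shows that the criteria can disagree even on tiny inputs.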

We  omit  the  required  redundancy  of  Wang  et  al.  [54]  since  it  provides  the  same 
order  of  nodes  as  RIF.  After  that,  we  evaluate  the  robustness  of  networks  under  different 
failure  tolerance  schemes  to  identify  the  suitable  network  design  considering  the  effect 
of  cascading  failures. 

2.5.1  Datasets 

Real network. We use the Western North American (WNA) power grid network [55] with 4941 substations and 6594 transmission lines to run experiments. However, the dataset lacks load and capacity information for nodes, thus we use the method in [56] to assign the load and capacity of each node. The initial load of node u is given by $L(u) = d(u)^{\beta}$, where d(u) is the degree of u and $\beta$ is a tunable parameter. This assignment method is reasonable, as the load of a node has been shown to scale with its degree [59]. Vertex capacities are assigned based on three different schemes.

Normal networks. In normal networks, the capacity C(u) of each node u is proportional to its initial load L(u):

$$C(u) = T \cdot L(u)$$

where T is a constant representing the system tolerance. The higher T is, the more resilient the network is under cascading failure.

Safe networks. In safe networks, the node capacities are assigned in two phases. First, the capacity C(u) of each node u is scaled as in the normal network, i.e.:

$$C(u) = T \cdot L(u)$$

Then, the capacities of all nodes are raised to satisfy the N-1 failure tolerance criterion, under which the failure of any single node causes no additional failed nodes. This means that any node u will not fail when it receives the redistributed load from any one of its neighbors. The capacity of u becomes:

$$C(u) = \max\Big\{ C(u),\ \max_{v \in N^-(u)} \{ L(u) + w(v, u)\,L(v) \} \Big\}$$

where C(u) on the right-hand side is the phase-one capacity.

Scaled safe networks. In contrast to safe networks, scaled safe networks are formed by first raising the node capacities to satisfy the N-1 failure tolerance criterion and then scaling them up. In particular, the network is made safe by assigning the capacity of each node u as:

$$C(u) = \max_{v \in N^-(u)} \{ L(u) + w(v, u)\,L(v) \}$$

Then the capacity of u is scaled up to $T \cdot C(u)$.
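The three capacity schemes can be compared side by side with a short sketch. The redistribution weight w(v, u) is defined outside this section; the code assumes, purely for illustration, an undirected network with w(v, u) = 1/d(v), i.e., a failed node splits its load equally among its neighbors. Function names are hypothetical.

```python
def initial_load(adj, beta):
    """L(u) = d(u)^beta, with d(u) the degree of u."""
    return {u: len(nbrs) ** beta for u, nbrs in adj.items()}


def normal_capacity(load, T):
    """Normal network: capacity proportional to load."""
    return {u: T * load[u] for u in load}


def n1_capacity(adj, load):
    """Smallest capacity that survives the failure of any single neighbor."""
    cap = {}
    for u, nbrs in adj.items():
        # ASSUMPTION: w(v, u) = 1 / d(v), equal split of the failed node's load.
        worst = max((load[v] / len(adj[v]) for v in nbrs), default=0.0)
        cap[u] = load[u] + worst
    return cap


def safe_capacity(adj, load, T):
    """Safe network: scale first, then raise to meet the N-1 criterion."""
    scaled, n1 = normal_capacity(load, T), n1_capacity(adj, load)
    return {u: max(scaled[u], n1[u]) for u in load}


def scaled_safe_capacity(adj, load, T):
    """Scaled safe network: meet the N-1 criterion first, then scale by T."""
    n1 = n1_capacity(adj, load)
    return {u: T * n1[u] for u in load}


if __name__ == "__main__":
    adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    load = initial_load(adj, beta=1)
    for scheme in (normal_capacity(load, 1.5),
                   safe_capacity(adj, load, 1.5),
                   scaled_safe_capacity(adj, load, 1.5)):
        print(scheme, "total =", sum(scheme.values()))
```

For the same T the scaled safe scheme allocates the largest total capacity, which is why the robustness comparison later equalizes total capacity across schemes.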

Synthesized networks. We also run the experiments on synthesized networks generated by the Erdős-Rényi random network model [26]. Each network has 5000 vertices with an average degree of 4 or 8. The other parameters of the networks are generated in the same way as in the above schemes.
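A synthetic instance of this kind can be generated with the standard library alone. The sketch below uses the G(n, p) variant of the Erdős-Rényi model with p chosen so that the expected average degree matches the target; whether the experiments used G(n, p) or a fixed number of edges is not stated, so this choice is an assumption, as is the function name.

```python
import random


def erdos_renyi(n, avg_degree, seed=None):
    """Undirected G(n, p) graph as an adjacency dict, with p = avg_degree / (n - 1)."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:   # include each possible edge independently
                adj[u].add(v)
                adj[v].add(u)
    return adj


if __name__ == "__main__":
    g = erdos_renyi(500, 4, seed=42)
    mean_degree = sum(len(nbrs) for nbrs in g.values()) / len(g)
    print(len(g), "nodes, mean degree", round(mean_degree, 2))
```

The quadratic loop is adequate for a few thousand nodes; for much larger graphs a sparse sampling method would be preferable.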

2.5.2  The  performance  of  Different  Algorithms 

We first compare the performance of the proposed algorithms with previous works. We run experiments on networks with different system tolerance values and $\beta = 1$. The results are shown in Fig. 2-3, 2-4, and 2-5. When networks have small tolerance values, they are highly vulnerable under any attacking strategy: the failure of one or two nodes can lead to the failure of almost the whole network. On the other hand, when T is high, the network can endure multiple attacks without failing. These figures also show that previous attacking strategies do not work well on safe networks. On safe networks, the FACP and CA algorithms perform best since they can adapt to the change of network status when choosing nodes to attack. Additionally, when the network is vulnerable, FACP shows better performance: early attacked nodes push the remaining nodes to the boundary of failure, and the redistributed load of later attacked nodes can then make them fail easily. The situation changes when networks are robust. The CA algorithm optimizes the number of failed nodes at each iteration, so the seed set it produces makes more nodes fail.

Figure 2-3. Vulnerability of WSN network under normal setting

Figure 2-4. Vulnerability of WSN network under safety setting


2.5.3  Network  Robustness  Under  Different  Settings 

We observed that scaled safe networks are more robust than safe networks, and safe networks are more robust than normal networks. However, scaled safe networks often have the highest total capacity while having the same node loads as the other two kinds of networks. Thus, we set up the experiment to measure the robustness of each kind of network as follows. We first generate the safe network, then we choose a suitable T value to generate normal and scaled safe networks such that the total capacity is the same. Fig. 2-6 shows that scaled safe networks are the most robust. Normal and safe networks have very similar robustness, although the safe factor can help to withstand the first attack. When multiple nodes are attacked, the N-1 failure tolerance setting does not help much. The failure of the first node makes some other nodes reach the failure limit; even a tiny additional load can then make them fail and trigger a large cascading failure. Therefore, we observe that the N-1 safe criterion does not protect power networks under cascading attack.

Figure 2-5. Vulnerability of WSN network under scaled safety setting

[Figure 2-6 legend: Normal Network, Scaled Safe Network; panel B: S4 (T = 1.6, α = 1)]

Figure 2-6. Network robustness with different failure tolerance schemes


2.5.4  Vertex  Load  and  Network  Robustness 


Now we vary the value of $\beta$ and measure the robustness of networks under the FACP attacking strategy. As illustrated in Fig. 2-7, the higher $\beta$ is, the more vulnerable the network is, as the load is concentrated at a few nodes. These nodes become the Achilles’ heel of the network. This suggests that we should distribute the workload of nodes in networks to make them less vulnerable.

[Figure 2-7 panels: A, WSN (T = 2); B, S4 (T = 1.6)]

Figure 2-7. Network robustness under different load distribution

2.5.5 Network Topology and Network Robustness

We next evaluate the robustness of networks with different average degrees.

As shown in Fig. 2-8, the network with the higher average degree is more robust than the network with the lower average degree. In the denser network, the load of a failed node is redistributed to a larger number of neighbors. Each neighbor receives a small fraction of the load, thus it can bear the additional load without failing.


Figure  2-8.  Network  topology  and  robustness 

2.6  Related  Works 

Cascading failure has attracted a lot of attention and has been studied from various perspectives [6, 22, 30, 40, 46, 53, 54, 60]. The structural vulnerability of power networks was studied in [6]. The authors showed that removing a small fraction of the highest degree nodes significantly reduces the connectivity of the network. After that, Hines et al. [30] studied the network vulnerability of different classes of networks, including Erdős-Rényi, preferential-attachment, and small-world networks. They showed that different types of networks behave differently under node failures. Various models of cascading failures were later proposed to assess the vulnerability of networks under targeted attack [7, 36, 53, 54]. However, these works mainly present different ranking methods for nodes and select the most critical nodes accordingly. These methods fail to address the effect of the cascading process, and hence miss the most critical nodes.

2.7 Summary


In this chapter, we studied the critical node detection problem under the load redistribution cascading model. Based on a new centrality which considers the cascading effect when evaluating the importance of nodes in networks, we developed several algorithms to identify critical nodes: one purely based on the new centrality, one based on a partially adaptive centrality, and one based on a fully adaptive centrality. Among them, the fully adaptive centrality algorithm continuously updates the centrality of nodes and selects the best one, hence it achieves the highest performance of the three. However, this algorithm suffers when the network is highly tolerant to cascading failure. We therefore proposed the cooperating attack algorithm, in which selected nodes cooperate to take down protected nodes with high capacity. The performance of the cooperating attack algorithm is supported by both theoretical and experimental results.

In addition, we use the proposed algorithms to study the vulnerability of different safety settings. We find that networks with low density are extremely vulnerable under cascading failure. In this kind of network, the load of a failed node is redistributed to a small number of neighbors and can make them fail easily. On the other hand, the load is split into smaller portions in networks with high density. We also discover that, even for networks with the same topology, total capacity, and load, the network safety depends a lot on the distribution of the protection cost (the gap between the capacity and the load). These findings shed light on designing safe networks.

CHAPTER 3

CASCADING  FAILURE  OF  NODES  IN  INTERDEPENDENT  NETWORKS 

With the development of technology, infrastructure networks increasingly depend on each other to operate properly. To optimize operational performance and reduce economic cost, these networks tend to utilize the support of other networks. A typical example is the Smart Grid, in which the power network uses the communication network to exchange operational information and the communication network uses the electricity from the power network to operate. Meanwhile, such growing interdependencies also dramatically impact the vulnerability of these networks, since a network is exposed not only to threats to itself but also to the cascading failures induced by other networks. From a typical attacker's point of view, an attacker would first exploit the network weaknesses and then only needs to target some critical nodes in either the power networks or their interdependent communication networks, whose corruption brings the whole network down to its knees. In other words, nodes in power networks depend heavily on nodes in their interdependent networks and vice versa. Consequently, when nodes in one network fail, they cause nodes in the other network to fail, too. For instance, a successful adversarial attack on essential Internet hosts, e.g., servers of tier-1 ISPs such as Qwest, AT&T, or Sprint, may cause tremendous breakdowns of millions of online services and, because of the cascading failures, a further large-area blackout. A real-world example is the wide-range blackout that affected the majority of Italy on 28 September 2003 [47], which resulted from the cascading failures induced by the dependence between power networks and communication networks. Therefore, in order to guarantee the robustness of power networks without reducing their performance by decoupling them from information systems, it is important to identify the critical nodes in interdependent power networks beforehand.
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Figure  3-1 .  Example  of  Interdependent  Power  Network  and  Communication  Network 

There have been many studies assessing network vulnerability [6, 8, 15, 27,
32, 45, 48]. Yet these approaches are either designed only for single networks or
heavily dependent on configuration models of interdependent networks. The existing
approaches [5, 6, 8, 39] for single networks are based on various metrics, such as the
degree of suspected nodes or edges [6], the average shortest path length [5], the global
clustering coefficient [39], and the pairwise conductivity [6, 8]. However,
when applied to interdependent networks, their performance drops tremendously
since these metrics fail to capture the cascading failures in interdependent networks.
Later, other researchers [15, 27, 32, 45, 48] studied vulnerability assessment on
interdependent networks, based on the size of the largest connected component in power
networks after cascading failures. Although they showed the effectiveness of this new
metric, most of them focus on artificial models of interdependent networks, i.e.,
random interdependency between networks, and ignore the detection of top critical
nodes in real networks.

Let us consider a simple example of interdependent networks in Fig. 3-1, which
illustrates a small portion of a power network (lower nodes), a communication network
(upper nodes), and their interdependencies (dotted links). When we only take the single
power network into account, the failure of $u_7$ damages the power network more than
that of $u_1$, since the largest connected component is of size 6 ($\{u_1, u_2, \ldots, u_6\}$) when $u_7$
fails, which is smaller than 9 ($\{u_2, \ldots, u_{10}\}$) when $u_1$ fails. However, if we consider its
interdependence with the communication network, the failure of $u_1$ will damage the
power network more than that of $u_7$. This is because the failure of $u_1$ causes the failure
of $v_1$ in the communication network, which further fails $v_2$, $v_3$, and $v_4$ since they are
disconnected from the largest connected component. Due to the interdependence of
the nodes $v_4$ and $u_7$, these cascading failures finally leave the
largest connected component of the power network as $\{u_8, u_9, u_{10}\}$, of size only 3. In contrast,
after the failure of $u_7$ the largest connected component remains the same as in the single
power network. This example illustrates an important point: the role of a node
can be totally different in single and interdependent networks with respect to
vulnerability assessment.

In this chapter, we investigate the vulnerability of interdependent networks when
cascading failures happen, based on the connectivity of nodes. When studying
interdependent networks, especially the power network and the communication network,
it is common to assume that a node fails when it is disconnected from the
largest connected component. Although this assumption does not fully capture reality,
it has a realistic interpretation: in the real world, a node that is disconnected from the largest
connected component is largely cut off from the source of power or information. In
this chapter, we study the problem in the context of the power network and the communication
network, but the results apply to all systems that share the same cascading
mechanism.

The rest of the chapter is organized as follows. In Section 3.1, we introduce
the interdependent network model and the problem definition. After that, Section
3.2 presents the hardness and inapproximability results. The greedy framework is
proposed in Section 3.3, along with the centrality metric. The experimental evaluation is
presented in Section 3.4. The related work is discussed in Section 3.6. Finally, Section
3.7 provides some concluding remarks.

3.1 Network Model and Problem Definition


In this section, we first introduce our interdependent network model and the
well-accepted model of cascading failure. Using these models, we study the
Interdependent Power Network Disruptor (IPND) problem, which aims to minimize the size of
the largest connected component in the power network after cascading failures by selecting
a certain number of target nodes.

3.1.1  Interdependent  Network  Model 

Considering an interdependent system, we abstract it into two graphs, $G_s = (V_s, E_s)$
and $G_c = (V_c, E_c)$, and their interdependencies, $E_{sc}$. $G_s$ and $G_c$ represent the power
network and the communication network, respectively. Each of them has a set of nodes
$V_s$, $V_c$ and a set of links $E_s$, $E_c$, which are referred to as intra-links. In addition, $E_{sc}$ is
the set of inter-links coupling $G_s$ and $G_c$, i.e., $E_{sc} \subseteq \{(u, v) \mid u \in V_s, v \in V_c\}$. A node $u \in V_s$ is
functional if it is connected to the giant connected component of $G_s$ and at least one of
its interdependent nodes in $G_c$ is in a working state. The whole interdependent system is
referred to as $\mathcal{I}(G_s, G_c, E_{sc})$.

3.1.2  Cascading  Failures  Model 

We use a well-accepted cascading failure model, which has been validated
and applied in many previous works [15, 27, 32, 45, 48]. Initially, a few
critical nodes fail in network $G_s$, which disconnects a set of nodes from the largest
connected component of $G_s$. Due to the interdependency of the two networks, all nodes
in $G_c$ that are connected only to failed nodes in $G_s$ are also affected, and therefore stop
working. Furthermore, the failures cascade to the nodes that are disconnected from the
largest connected component in $G_c$, which causes further failures back in $G_s$. The process
continues back and forth between the two networks until no more nodes fail.
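
The cascading process described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function names `largest_component` and `cascade`, the adjacency-dictionary representation, and the toy networks are all assumptions made for the example. It also assumes that every node has at least one inter-link, and it breaks ties between equally large components arbitrarily.

```python
from collections import deque

def largest_component(adj, alive):
    """Return the node set of the largest connected component among `alive` nodes."""
    seen, best = set(), set()
    for start in alive:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        if len(comp) > len(best):
            best = comp
    return best

def cascade(adj_s, adj_c, inter, removed):
    """Simulate cascading failures; return the surviving node sets (alive_s, alive_c).

    adj_s, adj_c: adjacency dicts of G_s and G_c (every node is a key).
    inter: iterable of inter-links (u, v) with u in G_s and v in G_c.
    removed: initially attacked nodes of G_s.
    """
    partners_s = {u: set() for u in adj_s}
    partners_c = {v: set() for v in adj_c}
    for u, v in inter:
        partners_s[u].add(v)
        partners_c[v].add(u)
    alive_s = set(adj_s) - set(removed)
    alive_c = set(adj_c)
    while True:
        before = (len(alive_s), len(alive_c))
        # Nodes cut off from the largest component of their own network fail.
        alive_s = largest_component(adj_s, alive_s)
        # A node needs at least one working partner in the other network.
        alive_c = {v for v in alive_c if partners_c[v] & alive_s}
        alive_c = largest_component(adj_c, alive_c)
        alive_s = {u for u in alive_s if partners_s[u] & alive_c}
        if (len(alive_s), len(alive_c)) == before:
            return alive_s, alive_c

# Toy system (hypothetical): G_s is a path 0-1-2-3-4-5, G_c has a hub node 10,
# and node i in G_s is coupled with node 10 + i in G_c.
adj_s = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
adj_c = {10: [11, 12, 13], 11: [10], 12: [10], 13: [10, 14], 14: [13, 15], 15: [14]}
inter = {(i, 10 + i) for i in range(6)}
alive_s, alive_c = cascade(adj_s, adj_c, inter, removed={0})
```

In this toy system, removing the end node 0 would leave five connected nodes if the power network were considered alone, but the cascade through the hub node 10 leaves only three functional nodes in each network.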

3.1.3  Problem  Definition 

Definition 3 (IPND problem). Consider an integer $k$ and an interdependent system
$\mathcal{I}(G_s, G_c, E_{sc})$, which consists of two networks $G_s = (V_s, E_s)$ and $G_c = (V_c, E_c)$ along with
their interdependencies $E_{sc}$. Let $L_{G_s}(T)$ be the size of the largest connected component
of $G_s$ after the cascading failures caused by the initial removal of the set of nodes
$T \subseteq V_s$ in $G_s$. The IPND problem asks for a set $T$ of size at most $k$ such that $L_{G_s}(T)$ is
minimized.

In the rest of the chapter, the pairs of terms interdependent networks and coupled
networks, node and vertex, as well as edge and link, are used interchangeably.

3.2  Computational  Complexity 

In this section, we first show the NP-completeness of the IPND problem by a reduction
from the maximum independent set problem, which further implies that the IPND problem is
NP-hard to approximate within a factor of $2 - \epsilon$ for any $\epsilon > 0$.

Theorem 3.1. The IPND problem is NP-complete.

Proof. Consider the decision version of IPND, which asks whether the graph $G_s = (V_s, E_s)$ in an
interdependent system $\mathcal{I}(G_s, G_c, E_{sc})$ contains a set of vertices $T \subseteq V_s$ of size $k$ such
that the largest connected component in $G_s[V_s \setminus T]$ after cascading failures has size at most $z$
for a given positive integer $z$. Given $T \subseteq V_s$, we can compute in polynomial time the size
of the largest connected component in $G_s$ after the cascading failures caused by removing $T$,
by iteratively identifying the largest connected component and removing disconnected
vertices in $G_s$ and $G_c$. This implies IPND $\in$ NP.

To prove that IPND is NP-hard, we reduce from the decision version of the
maximum independent set (MIS) problem, which asks for a subset $I \subseteq V$ of
maximum size such that no two vertices in $I$ are adjacent. Let an undirected graph
$G = (V, E)$ with $|V| = n$ and a positive integer $k \le |V|$ be any instance of MIS. Now
we construct the interdependent system $\mathcal{I}(G_s, G_c, E_{sc})$ as follows. We select $G_s = G$ and
$G_c$ to be a clique of size $|V_s|$. Since $G_s$ and $G_c$ have the same size in our construction,
to construct the interdependency between $G_s$ and $G_c$, we match each node in
$V_s$ to an arbitrary distinct node in $V_c$ so as to form a one-to-one correspondence between
$V_s$ and $V_c$. An example is illustrated in Fig. 3-2. We show that there is an independent set of size
at least $k$ in $G$ iff $G_s$ in $\mathcal{I}(G_s, G_c, E_{sc})$ has an IPND solution of size at most $n - k$ such that the largest
connected component in $G_s$ after cascading failures is of size at most 1.


Figure  3-2.  An  example  of  reduction  from  MIS  to  IPND 

First, suppose $I \subseteq V$ is an independent set of $G$ with $|I| \ge k$. By our construction, the largest
connected component of $G_s[I]$ has size 1, and no further cascading failure occurs because
$G_c$ is a clique. That is, $V_s \setminus I$, which has size at most $n - k$, is an IPND solution of $\mathcal{I}(G_s, G_c, E_{sc})$.

Conversely, suppose that $T' \subseteq V_s$ with $|T'| = n - k$ is an IPND solution of $\mathcal{I}(G_s, G_c, E_{sc})$,
that is, the largest connected component of $G_s[V_s \setminus T']$ is of size 1. We show that
$V_s \setminus T'$ is an independent set of $G$ of size $k$. Since the failure of nodes in $G_c$ will not cause any cascading
failure in $G_s$, the largest connected component of $G_s[V_s \setminus T']$ is of size 1 iff $V_s \setminus T'$ is an
independent set in $G_s$. That is, $V_s \setminus T'$ is also an independent set of $G$. □

As IPND is NP-complete, one may ask how tightly we can approximate the
solution, which leads to the theory of inapproximability. The inapproximability factor gives
a lower bound on the approximation ratio that any algorithm with a theoretical performance
guarantee can achieve. That is, no polynomial-time approximation algorithm can have a better
ratio than the inapproximability factor unless P = NP. We show that the above reduction
implies the $(2 - \epsilon)$-inapproximability factor for IPND in the following corollary.

Corollary 1. The IPND problem is NP-hard to approximate within a factor of $2 - \epsilon$ for any $\epsilon > 0$.

Proof. We use the reduction in the proof of Theorem 3.1. Suppose that there is a
$(2 - \epsilon)$-approximation algorithm $A$ for IPND. Then $A$ returns a solution whose largest connected
component in $G_s$ has size less than 2 in our constructed instance whenever the optimal size is 1.
Because this size is integral, it must equal 1, so algorithm $A$ can be applied to solve MIS on $G$ in
polynomial time. This contradicts the NP-hardness of MIS unless P = NP. □

3.3  Greedy  Framework  for  IPND  Problem 

In this part, we present different algorithms to detect the top critical nodes using the
greedy framework, which has been shown to be one of the most popular and effective
approaches to solving hard problems. The idea is to iteratively choose the most critical
node, whose removal degrades the functionality of the network as much as possible.

In detail, we propose the following three strategies to select a critical node in the
system at each iteration:

1) Select a node that maximizes the number of failed nodes after the cascading
failure.

2) Select a node that decreases the structural strength of the system as much as
possible. That is, when the number of removed nodes is large enough, the system
becomes weak. Therefore, the number of failed nodes increases considerably
under the effect of cascading failures.

3) Select a node that not only increases the number of failed nodes but also decreases the
structural strength.

In the rest of this section, we describe three algorithms corresponding to the above
strategies.

3.3.1  Maximum  Cascading  (Max-Cas)  Algorithm 

In the maximum cascading (Max-Cas) algorithm, we iteratively select a node $u$ that
leads to the largest number of newly failed nodes, i.e., the maximum marginal gain with respect to
the current set of attacked nodes $T$. When a new node $u$ fails, it results in a chain of
cascading failures. The number of newly failed nodes, referred to as the cascading impact
number, can be computed by simulating the cascading failures with the initial set $T \cup \{u\}$
on the interdependent system $\mathcal{I}$ as described in Section 3.1.2. However, the simulation
of cascading failures is time-consuming because failures propagate back and forth
between the two networks, and each step of the cascade requires identifying the largest
connected component of each network.

To this end, we further improve the running time of our algorithm by reducing the
number of simulations. The idea is to check only potential nodes whose removal creates
at least one more failed node in the same network due to the cascading failures. That
is, this node (or its coupled node) disconnects the network to which it belongs, i.e., it (or its
coupled node) is an articulation node of $G_s$ (or $G_c$), which is defined as any vertex whose
removal increases the number of connected components in $G_s$ (or $G_c$). The reason is
given in the following lemma.

Lemma 1. Given an interdependent system $\mathcal{I}(G_s, G_c, E_{sc})$, removing a node $u \in V_s$ from
the system causes at least one more node to fail due to the cascading failure iff $u$ (or its
coupled node $v \in V_c$) is an articulation node in $G_s$ (or $G_c$).

Proof. If $u$ is an articulation node of $G_s$, the removal of $u$ increases the number of
connected components in $G_s$ to at least two. By the definition of cascading failures in $G_s$,
all nodes disconnected from the largest connected component fail. Similarly,
when $v$ is an articulation node of $G_c$, removing $u$ causes $v$ to fail, so at least one
more node in $G_c$ fails. After that, these nodes make their coupled nodes in $G_s$ fail as well. On
the other hand, if neither $u$ nor $v$ is an articulation node, the removal of $u$ only makes
$v$ fail, and the rest of the two networks remains connected, which terminates the cascading
failures. □

According to this property, the proposed algorithm first identifies all articulation
nodes in both residual networks using Hopcroft and Tarjan's algorithm [31]. Note
that this algorithm runs in linear time on undirected graphs, which is faster than one
simulation of cascading failures. Thus, the running time of each iteration is significantly
improved, especially when the number of articulation nodes is small. Let
Max-Cas($G_s$, $T$, $\{u\}$) denote the impact number of $u$; Algorithm 5 describes the details of detecting
critical nodes. In Algorithm 5, since it takes $O(n)$ time to compute the cascading impact
number for each node and at most $|A| \le n$ articulation nodes are evaluated, the
running time is $O(kn^2)$ in the worst case. In practice, the actual running time is much
less due to the small size of $A$, which is further illustrated in Section 3.4.
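
The candidate filtering step can be sketched as follows. This is an illustrative recursive variant of the depth-first lowpoint technique, not the exact routine of [31]; the names `articulation_points` and `candidate_nodes` and the adjacency-dictionary input are assumptions for the example. It assumes simple undirected graphs, and for very large graphs the recursion would need to be made iterative.

```python
def articulation_points(adj):
    """Return the set of articulation nodes of a simple undirected graph."""
    disc, low, result = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v not in disc:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # A non-root u separates v's subtree if that subtree
                # has no back edge reaching above u.
                if parent is not None and low[v] >= disc[u]:
                    result.add(u)
            elif v != parent:
                low[u] = min(low[u], disc[v])
        # The DFS root is an articulation node iff it has two or more DFS children.
        if parent is None and children > 1:
            result.add(u)

    for u in adj:
        if u not in disc:
            dfs(u, None)
    return result

def candidate_nodes(adj_s, adj_c, inter):
    """Nodes of G_s that are articulation nodes or are coupled with one in G_c."""
    a_s = articulation_points(adj_s)
    a_c = articulation_points(adj_c)
    return a_s | {u for (u, v) in inter if v in a_c}

# Hypothetical example: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [1, 0, 3], 3: [2]}
points = articulation_points(adj)
```

Only the nodes returned by `candidate_nodes` would then need a full cascading simulation in each greedy iteration.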


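Algorithm 5 Max-Cas Greedy Algorithm
Input: Interdependent system $\mathcal{I}(G_s, G_c, E_{sc})$, an integer $k$
Output: Set $T \subseteq V_s$ of $k$ critical nodes
$T \leftarrow \emptyset$
for $i = 1$ to $k$ do
    $A_s, A_c \leftarrow$ sets of articulation nodes of $G_s$ and $G_c$, respectively
    $A \leftarrow \{u \in V_s \mid u \in A_s \lor ((u, v) \in E_{sc} \land v \in A_c)\}$
    if $A \neq \emptyset$ then
        $u \leftarrow \arg\max_{u \in A}$ Max-Cas($G_s$, $T$, $\{u\}$)
    else
        $u \leftarrow$ any node in $V_s \setminus T$
    end if
    $T \leftarrow T \cup \{u\}$
    Update $\mathcal{I}[V_s \setminus T]$
end for
Return $T$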


3.3.2  Iterative  Interdependent  Centrality  (IIC)  Algorithm 

As one can see, the Max-Cas algorithm prefers nodes whose removal decreases
the size of the networks immediately. This can mislead the algorithm into selecting boundary
nodes and hurts its effectiveness for large $k$, since the residual networks remain
highly connected even after many critical nodes have been removed. An alternative
strategy is to identify the hub nodes, which play the role of connecting other nodes
together in the network. This strategy has been shown to be effective in
single complex networks by Albert et al. [6], where the removal of a small fraction
of nodes with the highest degree centrality was shown to fragment the network into
small components. However, since this centrality method applies only to single networks,
a new centrality measure is needed for interdependent
systems.

Intuitively, this new centrality measure is required to capture both the intra-centrality
(the centrality of nodes within each network) and the inter-centrality (the centrality formed
by the interconnections between the two networks). Given an interdependent system
$\mathcal{I}(G_s, G_c, E_{sc})$, a node $u$ in $V_s$ is more likely to be critical if its coupled node $v \in V_c$ is
critical. Furthermore, when node $u$ is considered critical, its neighbors
also become more important, since the failures of these nodes can cause $u$ to fail.
In turn, the criticality of these neighbors implies the criticality of their coupled nodes. To
capture this complicated relation in interdependent systems, we develop an iterative
method to compute the centrality of nodes, called Iterative Interdependent Centrality
(IIC). Initially, the centralities of all nodes in $G_s$ are computed by a traditional centrality,
e.g., degree centrality or betweenness centrality. After that, these centralities of
nodes in $G_s$ are reflected to their coupled nodes in $G_c$, and the centralities of nodes in $G_c$ are
updated based on the reflected values. The centralities of nodes in $G_c$ are then
reflected back to the nodes of $G_s$ to update their centralities, and so on. The two key points of IIC
are the updating function and the convergence.

3.3.2.1  Updating  function 

Considering the reflected values from the other network as the weights of nodes, we
use a modified weighted degree as the updated centrality of nodes, which is defined as

$$C(u) = \alpha\, w(u) + (1 - \alpha) \sum_{v : (u,v) \in E} \frac{w(v)}{d_v}$$

where $w(\cdot)$ is the reflected value (or the weight of a node), $d_v$ is the degree of node $v$,
and the reservation factor $\alpha$ lies in the interval $[0, 1]$. The underlying reason we use the
weighted degree is that a node is usually more critical if most of its neighbors are critical
nodes. The reservation factor expresses that the importance of each node depends not only
on the reflected values from the other network, but also on its role in its own network.
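
One application of the updating function can be sketched as below. The sketch assumes that the sum runs over the neighbors $v$ of $u$ and that each neighbor's reflected weight is divided by its own degree $d_v$; the function name `update_centrality` and the dictionary-based inputs are hypothetical choices for the example.

```python
def update_centrality(adj, w, alpha=0.5):
    """Return C(u) = alpha * w(u) + (1 - alpha) * sum_{v ~ u} w(v) / d_v for every node u.

    adj: adjacency dict of one network (every node is a key).
    w: reflected weight of each node, taken from its coupled node.
    alpha: reservation factor in [0, 1].
    """
    return {
        u: alpha * w[u] + (1 - alpha) * sum(w[v] / len(adj[v]) for v in adj[u])
        for u in adj
    }

# Hypothetical star network: hub 0 with leaves 1, 2, 3, all with reflected weight 1.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
c = update_centrality(star, {u: 1.0 for u in star}, alpha=0.5)
```

With uniform weights the hub obtains a centrality of 2.0, while each leaf obtains 0.5 + 0.5/3, so the hub is ranked first, as expected from a degree-like measure.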

3.3.2.2 Convergence

Next, we show that the centralities of nodes can be computed based on matrix
multiplications and prove the convergence via this property. Let $x^t = [x_1^t, x_2^t, \ldots, x_n^t]$ and
$y^t = [y_1^t, y_2^t, \ldots, y_n^t]$ be the normalized centrality vectors of $G_s$ and $G_c$ after the $t$-th iteration.
Suppose that two interdependent nodes have the same position in the vectors $x^t$ and $y^t$, i.e.,
$v_i^s$ and $v_i^c$ are interdependent. Then, we have

$$\bar{x}_u^t = \alpha\, y_u^{t-1} + (1 - \alpha) \sum_{v : (u,v) \in E_s} \frac{y_v^{t-1}}{d_v} \quad \forall u \in V_s$$

$$\bar{y}_u^t = \alpha\, x_u^{t-1} + (1 - \alpha) \sum_{v : (u,v) \in E_c} \frac{x_v^{t-1}}{d_v} \quad \forall u \in V_c$$

Note that if we divide these vectors by a constant, then they still represent the
centrality of nodes in the systems. Thus, after each iteration, these centrality vectors are
divided by some constants $C_s$ and $C_c$, which are chosen later to prove the convergence:

$$x^t = \bar{x}^t / C_s, \qquad y^t = \bar{y}^t / C_c$$


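Let $M^s$ and $M^c$ be $n \times n$ matrices such that

$$M^s_{uv} = \begin{cases} \alpha & \text{if } u = v \\ \frac{1 - \alpha}{d_v} & \text{if } (u, v) \in E_s \\ 0 & \text{otherwise} \end{cases} \qquad M^c_{uv} = \begin{cases} \alpha & \text{if } u = v \\ \frac{1 - \alpha}{d_v} & \text{if } (u, v) \in E_c \\ 0 & \text{otherwise} \end{cases}$$

Then the relationship between $x^t$ and $y^t$ is rewritten as:

$$x^t = \frac{M^s y^{t-1}}{C_s}, \qquad y^t = \frac{M^c x^{t-1}}{C_c}$$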

Therefore

$$x^t = \frac{M^s M^c x^{t-2}}{C_s C_c} = \frac{M x^{t-2}}{C_s C_c}$$

where $M = M^s M^c$ is called the characteristic matrix. Next, we analyze the condition
on this matrix that guarantees that $x^t$ converges, using the Jordan canonical form of $M$,
defined as follows.

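Theorem 3.2 (Jordan Canonical Form [50]). Any $n \times n$ matrix $M$ with $n$ eigenvalues
$|\lambda_1| \ge |\lambda_2| \ge \ldots \ge |\lambda_n|$ can be represented as $M = PJP^{-1}$, where $P$ is an invertible matrix
and $J$ is a Jordan matrix, which has the form

$$J = \mathrm{diag}(J_1, \ldots, J_p)$$

where each block $J_i$, called a Jordan block, is a square matrix of the form

$$J_i = \begin{bmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{bmatrix}$$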

According to the above definition, the powers of the matrix $M$ can be computed as
follows:

$$M^k = (PJP^{-1})^k = PJ^kP^{-1}$$

Hence, $M^k$ converges when $k \to \infty$ if and only if $J^k$ converges. The powers of $J$ are
computed via the powers of the Jordan blocks $J_1^k, J_2^k, \ldots, J_p^k$:

$$J^k = \mathrm{diag}(J_1^k, \ldots, J_p^k)$$
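
where, for a Jordan block of size $d \times d$,

$$J_i^k = \begin{bmatrix} \lambda_i^k & \binom{k}{1}\lambda_i^{k-1} & \binom{k}{2}\lambda_i^{k-2} & \cdots & \binom{k}{d-1}\lambda_i^{k-d+1} \\ & \lambda_i^k & \binom{k}{1}\lambda_i^{k-1} & \cdots & \binom{k}{d-2}\lambda_i^{k-d+2} \\ & & \ddots & \ddots & \vdots \\ & & & \lambda_i^k & \binom{k}{1}\lambda_i^{k-1} \\ & & & & \lambda_i^k \end{bmatrix}$$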

Note that the powers of $J$ converge if and only if the powers of all Jordan blocks
converge. Thus, we focus on the convergence of a block $J_i^k$ as stated in the following
lemma.

Lemma 2. The convergence of a $d \times d$ Jordan block $J_i$ depends only on $d$ and $\lambda_i$:

(1) If $|\lambda_i| > 1$, then $J_i^k$ does not converge when $k \to \infty$.

(2) If $|\lambda_i| < 1$, then $J_i^k$ converges to 0 when $k \to \infty$.

(3) If $|\lambda_i| = 1$ and $\lambda_i \neq 1$, then $J_i^k$ does not converge when $k \to \infty$.

(4) If $\lambda_i = 1$ and $d = 1$, then $J_i^k = [1]$.

(5) If $\lambda_i = 1$ and $d > 1$, then $J_i^k$ does not converge when $k \to \infty$.

Proof. Cases (1), (3), (4), and (5) are trivial, thus we only show the proof for case (2).
With $|\lambda_i| < 1$, every nonzero element of $J_i^k$ has the form $\binom{k}{j}\lambda_i^{k-j}$, which converges to 0 as $k \to \infty$. □
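
As a small worked illustration of cases (2) and (5), consider a Jordan block of size $d = 2$; the numeric values below are chosen only for this example.

```latex
J_i = \begin{bmatrix} \lambda_i & 1 \\ 0 & \lambda_i \end{bmatrix},
\qquad
J_i^k = \begin{bmatrix} \lambda_i^k & k\,\lambda_i^{k-1} \\ 0 & \lambda_i^k \end{bmatrix}.

% Case (2): \lambda_i = 1/2 and k = 10 give
%   \lambda_i^{k} = 2^{-10} \approx 0.00098, \qquad
%   k\,\lambda_i^{k-1} = 10 \cdot 2^{-9} = 10/512 \approx 0.0195,
% and both entries tend to 0 because the geometric decay dominates the factor k.

% Case (5): \lambda_i = 1 gives
%   J_i^k = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix},
% whose off-diagonal entry grows without bound, so J_i^k does not converge.
```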

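According to this lemma, when the normalizing factors $C_s$, $C_c$ satisfy $C_s C_c = |\lambda_1|$, we
will have

$$x^t = \left(\frac{M}{C_s C_c}\right)^{t/2} x^0 = P \left(\frac{J}{|\lambda_1|}\right)^{t/2} P^{-1} x^0$$

Clearly, $x^t$ will converge when $\left(\frac{J}{|\lambda_1|}\right)^{t/2}$ converges. Then, we have the following
theorem.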

Theorem  3.3.  The  centrality  vector  converges  if  and  only  if  the  characteristic  matrix  has 
exactly  one  maximum  magnitude  eigenvalue. 

To compute the converged centrality vector, we first choose $\alpha$ such that $M$ has
$|\lambda_1| > |\lambda_2|$. In practice, we choose $\alpha = 0.5$ and the centrality vectors always converge.




Although it seems necessary to compute the largest eigenvalue of $M$, we propose an
alternative method to avoid this time-consuming computation as follows. Suppose that
$x^{2t}$ converges to a vector $x$ after $t_0$ iterations, i.e., $x = x^{2t_0}$. Now we define the sequence
of vectors $z^0 = x^0$, $z^{i+1} = \frac{M z^i}{\|M z^i\|}$; then:

$$z^{t_0} = \frac{M^{t_0} z^0}{\prod_{i=0}^{t_0 - 1} \|M z^i\|} = \frac{M^{t_0} x^0}{\prod_{i=0}^{t_0 - 1} \|M z^i\|}$$

It means that $x = \lambda z^{t_0}$, where $\lambda$ is a scalar value. Therefore $z^{t_0} = \frac{x}{\|x\|}$. Thus we can
compute the centrality vector using the recursive formula of $z$ as described in Algorithm
6, and then use this algorithm as a subroutine to detect critical nodes in Algorithm 7.


Algorithm 6 Iterative Interdependent Centrality
Input: Characteristic matrix $M$ and allowed error $\epsilon$
Output: Centrality vector
$x \leftarrow \mathbf{1}$
error $\leftarrow +\infty$
while error $> \epsilon$ do
    $y \leftarrow Mx$
    norm $\leftarrow \|y\|$
    $y \leftarrow y / \text{norm}$
    error $\leftarrow \|y - x\|$
    $x \leftarrow y$
end while
Return $x$
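
Algorithm 6 is a normalized power iteration, and a direct transcription is sketched below. The sketch uses dense nested lists for readability, whereas a practical implementation would use sparse products with $M^s$ and $M^c$. The function name `iic_power_iteration` and the `max_iter` safeguard are assumptions added for the example, and the code assumes that $Mx$ is never the zero vector.

```python
import math

def iic_power_iteration(M, eps=1e-8, max_iter=10_000):
    """Return the normalized centrality vector for the characteristic matrix M.

    M: square matrix as a list of rows.
    eps: allowed error between two successive normalized vectors.
    max_iter: safeguard against non-convergence (not part of Algorithm 6).
    """
    n = len(M)
    x = [1.0] * n                      # x <- all-ones vector
    for _ in range(max_iter):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]  # y <- Mx
        norm = math.sqrt(sum(v * v for v in y))
        y = [v / norm for v in y]      # y <- y / ||y||
        error = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, x)))
        x = y
        if error <= eps:
            break
    return x

# Hypothetical 2 x 2 matrix whose dominant eigenvector is proportional to (1, 1).
x = iic_power_iteration([[2.0, 1.0], [1.0, 2.0]])
```

Because each iterate is rescaled to unit length, the largest eigenvalue of $M$ never has to be computed explicitly.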


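Algorithm 7 IIC-based Algorithm
Input: Interdependent system $\mathcal{I}(G_s, G_c, E_{sc})$, an integer $k$
Output: Set $T \subseteq V_s$ of $k$ critical nodes
$T \leftarrow \emptyset$
for $i = 1$ to $k$ do
    $\alpha \leftarrow 0.5$, compute $M$
    $\epsilon \leftarrow 10^{-8}$
    Compute centrality vector $x$ using Algorithm 6
    $u \leftarrow \arg\max_{u \in V_s \setminus T} x[u]$
    $T \leftarrow T \cup \{u\}$
    Update $\mathcal{I}[V_s \setminus T]$
end for
Return $T$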

Time Complexity: Since the two matrices $M^s$ and $M^c$ have only $(2|E_s| + |V_s|)$ and
$(2|E_c| + |V_c|)$ non-zero elements, the product $Mx = M^s M^c x$ takes $O(2|E_s| + |V_s| + 2|E_c| +
|V_c|)$ time using sparse matrix multiplication. The convergence speed is $\frac{|\lambda_2|}{|\lambda_1|}$, thus the
number of iterations is $O\left(\frac{\log(1/\epsilon)}{\log(|\lambda_1| / |\lambda_2|)}\right)$. Therefore, the total running time to compute the iterative
interdependent centrality is $O\left((|E_s| + |E_c|) \frac{\log(1/\epsilon)}{\log(|\lambda_1| / |\lambda_2|)}\right)$. Thus, the total time to detect $k$ critical
nodes is $O\left(k (|E_s| + |E_c|) \frac{\log(1/\epsilon)}{\log(|\lambda_1| / |\lambda_2|)}\right)$.

3.3.3  Hybrid  Algorithm 

Motivated by the advantages of the Max-Cas and IIC algorithms, we further design a
hybrid algorithm that combines them. As one can see, Max-Cas only works
well when networks are loosely connected, since it mainly aims to create as many failed
nodes as possible instead of making the network as weak as possible. On the other
hand, the IIC algorithm can make the network weak, but it does not work as well as Max-Cas
when networks are loosely connected. Thus, the idea of the hybrid algorithm is to alternate between
removing as many nodes as possible and making the networks weaker. That is, we use the Max-Cas
and IIC algorithms in odd and even iterations, respectively, as described in Algorithm 8.
Since the running time of IIC is much smaller than that of Max-Cas, the running time of the
Hybrid algorithm is about half that of Max-Cas, which is shown empirically in Section 3.4.


Algorithm 8 Hybrid Algorithm
Input: Interdependent system $\mathcal{I}(G_s, G_c, E_{sc})$, an integer $k$
Output: Set $T \subseteq V_s$ of $k$ critical nodes
$T \leftarrow \emptyset$
for $i = 1$ to $k$ do
    if $i$ is odd then
        Select $u$ as in the Max-Cas algorithm
    else
        Select $u$ as in the IIC algorithm
    end if
    $T \leftarrow T \cup \{u\}$
    Update $\mathcal{I}[V_s \setminus T]$
end for
Return $T$

3.4 Experimental Evaluation


3.4.1  Dataset  and  Metric 

In  the  experiment,  we  evaluate  Max-Cas,  IIC,  and  Hybrid  algorithms,  with  respect 
to  the  size  of  giant  connected  component  (GCC)  and  the  running  time,  on  various 
real-world  and  synthetic  datasets. 

In terms of power networks, we use both the real Western States power network of the
US [55], with 4941 nodes and 6594 edges, and synthetic scale-free networks. This
network, like communication networks, belongs to a class of networks called
scale-free networks, in which the number of nodes with degree $d$, denoted by $P(d)$, is
proportional to $d^{-\beta}$, i.e., $P(d) \sim d^{-\beta}$ for some exponential factor $\beta > 0$. According to
[9], power networks are found to have exponential factors $\beta$ between 2.5 and 4.

In order to conduct a more comprehensive experiment, we further generate more types of
synthetic power networks with different exponential factors, using the igraph package [23].

In addition, due to the lack of data describing interdependencies between any
communication networks and the real-world power network, we use synthetic
scale-free networks to represent communication networks, e.g., the Internet, telephone
networks, etc. Since most communication networks are observed to have the scale-free
property with exponential factors $\beta$ between 2 and 2.6 [12, 57], we generate
communication networks with exponential factors of 2.2 or 2.6.

For the coupling method, motivated by the observation from real-world
interdependent systems in [44], we develop a realistic and practical coupling approach, the
Random Positive Degree Correlation Coupling (RPDCC) scheme. In this scheme, nodes
with high degrees tend to be coupled together, and so do nodes with low degrees; thus
the degree correlation of coupled nodes is positive, as described in [44]. The details of the
RPDCC strategy are discussed later in Section 3.5.

Finally,  each  experiment  on  synthesized  systems  is  repeated  100  times  to  compute 
the  average  results. 

3.4.2 Performance of Proposed Algorithms

Because of the intractability of the IPND problem and the time required to obtain
optimal solutions, we show the effectiveness of our proposed algorithms by
comparing them with traditional centrality approaches that are often used in network
analysis [11], including degree centrality (DC), closeness centrality (CC), betweenness
centrality (BC) [13], and bridgeness centrality (BRC) [33]. In these approaches,
the $k$ nodes with the largest centralities in the power network are selected as critical nodes.
In particular, we test our approaches on the following three types of datasets:

1) WS System: US Western States power network coupled with a scale-free communication
network with $\beta = 2.2$.

2) SS System: Scale-free power network with $\beta = 3.0$ coupled with a scale-free communication
network with $\beta = 2.2$.

3) Eq-SS System: Scale-free power and communication networks with the same
$\beta = 2.6$.

Here we choose the exponential factor $\beta$ according to real-world power networks
and communication networks, as mentioned above in Section 3.4.1.

Fig. 3-3 reports the comparison of performance between different approaches in
these three interdependent systems. In these figures, all three proposed algorithms
outperform CC (the best traditional centrality approach) for any number $k$ of critical
nodes. When $k$ becomes larger, the interdependent systems are totally destroyed
by removing the critical nodes chosen by our algorithms, while more than 60% of nodes
remain intact when selecting nodes with the highest traditional centralities. In particular, in the WS
interdependent system, which contains the real-world US Western States power system, the
number of functional nodes remains nearly 5000 even when 50 nodes are selected using CC,
whereas removing only 20 nodes using our Hybrid or Max-Cas approach is sufficient to
destroy the whole system. That is, these traditional approaches perform much
worse than our algorithms, especially when the number of attacked nodes
is large. Thus, the traditional methods cannot identify a correct set of critical nodes
in interdependent systems. The reason is that these approaches can only reflect the
importance of each node in single power networks rather than interdependent systems,
and they ignore the impact of cascading failures on interdependent systems.




Figure  3-3.  Performance  Comparison  on  Different  Interdependent  Systems:  WS  System 
(A,  B),  SS  System  (C,  D),  and  Eq-SS  System  (E,  F). 


Comparing our three proposed approaches, as revealed in Fig. 3-3, IIC runs fastest
despite having the worst performance; it is roughly 1000 times faster than Max-Cas on
the WS interdependent system. We also notice that the performance of the Max-Cas and
Hybrid algorithms is very close, while the Hybrid algorithm runs about 2 times faster
than Max-Cas. In particular, Max-Cas performs better than Hybrid on the SS
interdependent system, yet worse on the other two systems. This is because the power
network with $\beta = 3.0$ is very loosely connected and fragile in the SS interdependent
system, so the Max-Cas strategy can destroy the system quickly and easily. However,
since nodes are better connected in the other two systems, especially Eq-SS, the Hybrid
algorithm is more efficient due to its strategy of first weakening the networks and then
destroying them. As illustrated in Fig. 3-3E, the performance of Hybrid is lower than
that of Max-Cas initially, but higher once the networks become weak enough.
Additionally, in all of these systems, the whole system fails once the number of removed
nodes reaches a certain value: about 20 for the WS and SS systems and 40 for the Eq-SS
system. This shows that interdependent networks are vulnerable, especially when loosely
connected.

3.4.3  Vulnerability  Assessment  of  Interdependent  Systems 

Given the effectiveness of the Hybrid algorithm observed in the above experiments,
we use it to further assess the vulnerability of interdependent systems and to explore
some of their properties.

3.4.3.1  Different  coupled  communication  networks 

We are interested in investigating the vulnerability of a certain power network
when it is coupled with different communication networks. First, we fix one synthetic
power network by generating a scale-free network with $\beta = 3$ according to [9]. The
coupled communication networks are also generated as scale-free networks, with their
exponential factors $\beta$ between 2.5 and 2.7, as mentioned above. All generated networks
have 1000 nodes.

As illustrated in Fig. 3-4, the power networks tend to be more vulnerable when their
coupled communication networks are sparser, i.e., have a larger exponential factor
$\beta$. This suggests that a power network becomes more vulnerable when its coupled
network is easy to attack. In particular, the numbers of critical nodes needed to destroy
the power network are 23, 17, and 11 when the coupled communication networks have
$\beta = 2.5$, $\beta = 2.6$, and $\beta = 2.7$, respectively. These values indicate key
thresholds for protecting the function of power networks given knowledge of their
interdependent networks.

Figure 3-4. The Vulnerability of a Fixed Power Network

3.4.3.2  Disruptor  threshold 

In this part, we evaluate an important indicator of vulnerability, the disruptor
threshold, which is the number of nodes whose removal totally destroys the whole
system. The smaller it is, the more vulnerable the system is. We would like to observe
how the disruptor threshold depends on the network size. In particular, we generate
two scale-free networks of the same size with exponential factors $\beta$ of 3.0 and 2.2,
corresponding to the power and communication networks, and then couple them using the
RPDCC scheme.

As shown in Fig. 3-5, the disruptor threshold provided by all proposed algorithms
is small and increases slowly as the network size grows. When the network size is
raised by 5 times, from 1000 to 5000 nodes, the disruptor threshold only increases
roughly 3 times. When the network size is 5000, the disruptor thresholds of the Max-Cas
and Hybrid algorithms are roughly 51 and 57. This implies that removing roughly 1% of
the nodes is enough to destroy the whole system. Even the IIC algorithm needs to destroy
only 1.5% of the nodes to break the system down. Large interdependent systems seem to be
extremely vulnerable under different attack strategies for the following reason. When
the network size grows, the probability that a high-degree node depends on a low-degree
node also increases. As a result, it is easier to disable high-degree nodes, which often
play an important role in network connectivity. Therefore, the vulnerability of an
interdependent system needs to be reevaluated regularly, especially for fast-growing
systems.

Figure 3-5. The Disruptor Threshold with Different Network Sizes

3.4.3.3  Different  coupling  schemes 

Another interesting question is how the way nodes are coupled affects the
vulnerability of the interdependent system. Apart from the RPDCC scheme, we evaluate
the robustness with three other coupling strategies, as follows:

1) Same Degree Order Coupling (SDOC): The nodes of i-th highest degree in the two
networks are coupled together.

2) Reversed Degree Order Coupling (RDOC): The node of i-th highest degree in one
network is coupled with the node of i-th lowest degree in the other network.

3) Random Negative Degree Correlation Coupling (RNDCC): A node of higher degree in
one network is randomly coupled, with lower probability, with a node of higher degree
in the other network.
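The two deterministic schemes above can be sketched in a few lines. This is only an illustration under stated assumptions: the degree dictionaries, the node names, and the tie-breaking rule (by node id) are hypothetical and are not specified in the text.

```python
def degree_order(degrees):
    """Return node ids sorted from highest to lowest degree.

    Ties are broken by node id; this rule is an assumption of the sketch.
    """
    return sorted(degrees, key=lambda node: (-degrees[node], node))


def sdoc(deg_a, deg_b):
    """Same Degree Order Coupling: i-th highest with i-th highest."""
    return list(zip(degree_order(deg_a), degree_order(deg_b)))


def rdoc(deg_a, deg_b):
    """Reversed Degree Order Coupling: i-th highest with i-th lowest."""
    return list(zip(degree_order(deg_a), reversed(degree_order(deg_b))))


# Toy degree sequences (hypothetical data, equal network sizes assumed).
power = {"p1": 5, "p2": 1, "p3": 3}
comm = {"c1": 2, "c2": 7, "c3": 4}
print(sdoc(power, comm))
print(rdoc(power, comm))
```

Under SDOC the hub p1 is paired with the hub c2, while under RDOC it is paired with the lowest-degree node c1, which is the kind of pairing the discussion below identifies as a weak point.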

Note that the RNDCC scheme is the opposite strategy to the RPDCC scheme (in
Section 3.5). We test on interdependent systems consisting of a power network with
$\beta = 3$ and a communication network with $\beta = 2.2$, using the four different
coupling schemes. All networks have 1000 nodes.

Fig. 3-6 reports the vulnerability of power networks when they are coupled with
communication networks in different manners. As one can see, SDOC provides the most
robust interdependent system, although it is not practical. The size of the remaining
giant connected component decreases slowly as the number of removed nodes increases.
On the other hand, RDOC makes the system very vulnerable: it can be destroyed by only
removing 1 nodes from the power network. This is because the nodes of lower degree in
the communication network fail very easily, which immediately causes the failure of
their coupled nodes of higher degree in the power network. When many high-degree nodes
are removed, the network is easily fragmented, which shortly leads to the destruction of
the whole system. The robustness of the interdependent systems coupled with the other
two schemes, RPDCC and RNDCC, lies between that of SDOC and RDOC, due to the random
factors in RPDCC and RNDCC. Compared with RNDCC, systems coupled by RPDCC are almost
twice as robust because of the positive correlations. These results point out an
important principle: the higher the correlation between the degrees of coupled nodes,
the stronger the interdependent system is. In other words, a node of high degree in one
network should not be coupled with a node of low degree in the other network; otherwise,
this node will be a weak point to attack.

Figure 3-6. Vulnerability Comparison using Different Coupling Schemes

3.5 RPDCC / RNDCC Coupling Schemes

In this section, we present the RPDCC scheme to randomly couple two networks
with positive degree correlation. Given two networks $G_s$ and $G_c$, we form two weighted
sets that contain the vertices of $G_s$ and $G_c$ as elements and their degrees as weights.
Then we generate two random weighted permutations $\{v_1^s, v_2^s, \ldots, v_n^s\}$ and
$\{v_1^c, v_2^c, \ldots, v_n^c\}$ of the nodes in $G_s$ and $G_c$ as described in Algorithm 9,
and $v_i^s$ is coupled with $v_i^c$, $1 \le i \le n$.

Algorithm 9 Random weighted permutation
Input: A weighted set of n elements $X = \{x_1, x_2, \ldots, x_n\}$ with weights $w(\cdot)$
Output: Weighted random permutation Y of X.

total ← $\sum_{i=1}^{n} w(x_i)$
for i = 1 to n do
    e ← a randomly selected element of X, chosen with probability w(e)/total
    Y[i] ← e; X ← X \ {e}; total ← total - w(e)
end for
Return Y

In the following theorem, we show that a node of larger weight has a smaller expected
index in each permutation, that is, nodes of high degree tend to have low indices in
both permutations. This results in a positive correlation between the degrees of
coupled nodes. (For RNDCC, we couple $v_i^s$ with $v_{n+1-i}^c$.)
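A minimal sketch of Algorithm 9 and of the two random coupling schemes follows. It assumes strictly positive degrees (Python's `random.choices` rejects an all-zero weight list) and two networks of equal size; the function names and the mirrored index used for RNDCC are illustrative assumptions rather than the authors' implementation.

```python
import random


def weighted_permutation(weights, rng):
    """Algorithm 9: draw elements without replacement, each time picking an
    element with probability proportional to its weight among those left.
    Assumes every weight is strictly positive."""
    remaining = dict(weights)
    order = []
    while remaining:
        items = list(remaining)
        pick = rng.choices(items, weights=[remaining[x] for x in items])[0]
        order.append(pick)
        del remaining[pick]
    return order


def rpdcc(deg_s, deg_c, rng):
    """Positive correlation: couple the i-th elements of both permutations."""
    return list(zip(weighted_permutation(deg_s, rng),
                    weighted_permutation(deg_c, rng)))


def rndcc(deg_s, deg_c, rng):
    """Negative correlation: couple the i-th element of one permutation with
    the element at the mirrored position of the other (assumed indexing)."""
    perm_s = weighted_permutation(deg_s, rng)
    perm_c = weighted_permutation(deg_c, rng)
    return list(zip(perm_s, reversed(perm_c)))


rng = random.Random(7)  # fixed seed only to make the demo repeatable
deg_s = {"a": 4, "b": 2, "c": 1}
deg_c = {"x": 3, "y": 3, "z": 1}
print(rpdcc(deg_s, deg_c, rng))
print(rndcc(deg_s, deg_c, rng))
```

This version rebuilds the candidate list at every draw, so it runs in quadratic time; it is meant to mirror the pseudocode, not to be efficient.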

Theorem 3.4. In the random weighted permutation, an element with bigger weight has
lower expected index than an element with smaller weight.

Proof. Let $E(X, e)$ be the expected index of an element $e$ in the random weighted
permutation of $X$. Then, we have:

$$E(X, e) = \frac{w(e)}{\sum_{x \in X} w(x)} + \sum_{z \in X \setminus \{e\}} \frac{w(z)}{\sum_{x \in X} w(x)} \bigl(1 + E(X \setminus \{z\}, e)\bigr)$$

Therefore, $E(X, e_1) < E(X, e_2)$ if $w(e_1) > w(e_2)$. □
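The recurrence in the proof can be evaluated exactly on a small weighted set, which gives a quick sanity check of the theorem. The sketch below assumes positive integer weights and uses exact rational arithmetic; the function name and the toy weights are illustrative only.

```python
from fractions import Fraction


def expected_index(weights, e):
    """Exact expected 1-based index of e, following the recurrence above.

    weights maps each element to a positive integer weight (an assumption
    made so that Fraction arithmetic stays exact).
    """
    total = sum(weights.values())
    # With probability w(e)/total, e is drawn first and gets index 1.
    value = Fraction(weights[e], total)
    for z, w_z in weights.items():
        if z == e:
            continue
        rest = {x: w for x, w in weights.items() if x != z}
        # Otherwise some z is drawn first and e lands one position later
        # than its index in the permutation of the remaining elements.
        value += Fraction(w_z, total) * (1 + expected_index(rest, e))
    return value


weights = {"a": 3, "b": 2, "c": 1}
for name in weights:
    print(name, expected_index(weights, name))
```

On this toy input the expected indices come out ordered as the theorem predicts (the heaviest element first), and they sum to 1 + 2 + 3 = 6, as they must for a permutation of three elements. The recursion is exponential in the set size, so it is only suitable for small checks.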


3.6  Related  Works 

Most works on network vulnerability assessment study single networks
[8, 10, 21, 29, 41]. Centrality measurements [11] are widely used, including degree,
betweenness, and closeness centralities, as well as the average shortest path length [5]
and global clustering coefficients [39]. Alternatively, Arulselvan et al. [8] first
proposed the total pairwise connectivity as an effective measurement, based on which
they formulated the CND problem and designed a heuristic to detect critical nodes. The
$\beta$-disruptor problem was later defined by Dinh et al. [25], followed by
pseudo-approximation algorithms. Unfortunately, these approaches fail to accurately
identify the critical nodes in interdependent networks.

Recently, vulnerability assessment of interdependent networks was initiated by
Buldyrev et al. [15] and followed by a set of related papers [15, 27, 32, 45, 48]. These
works validated the size of the largest connected component as an effective metric for
cascading failures, covering random failures [15], order percolation phase transition
[16, 27, 45], and robustness under targeted attacks [32]. Their results illustrated that
interdependent networks are much more vulnerable than single networks. Unfortunately,
these works depend heavily on configuration models and are therefore not applicable to
real-world networks. Moreover, none of them proposed a strategy to identify the top
critical nodes in interdependent networks.

3.7  Summary 

In this chapter, we studied the optimization problem of detecting critical nodes to
assess the vulnerability of interdependent power networks based on the well-accepted
cascading failure model and metric, the size of the largest connected component. We
showed its NP-hardness, along with its inapproximability. Due to its intractability,
we proposed a greedy framework with various novel centralities, which measure the
importance of each node more accurately on interdependent networks. The extensive
experiments not only illustrate the effectiveness of our approaches on networks with
different topologies and interdependencies, but also reveal several important
observations on interdependent power networks.
CHAPTER 4

INFLUENCE  DIFFUSION  IN  MULTIPLE  ONLINE  SOCIAL  NETWORKS 

In the recent decade, the popularity of online social networks (OSNs), such as
Facebook, Google+, Myspace, and Twitter, has created a new major communication medium
and formed a promising landscape for information sharing and discovery. On average,
Facebook users spend 7h:45 per person per month [4]; 3.2 billion likes and comments
are posted every day on Facebook [3]; 340 million tweets are sent out every day
on Twitter [4]. Such engagement of online users fertilizes the land for information
propagation to a degree never achieved before in mass media. More importantly, OSNs
also inherit one of the major properties of real social networks: the word-of-mouth or
peer-pressure effect, in which an individual's opinion or decision is influenced by his
friends and colleagues. Due to the considerable impact of this effect on the popularity
of new products [14, 28], OSNs have rapidly become one of the most attractive choices
for raising the awareness of new products or brands as well as for reinforcing the
connection between customers and companies. The crucial problem is how to find the
smallest set of influencers who can influence a massive number of users.

There is a considerable number of overlapping users among multiple OSNs, which
has a huge effect on the diffusion of information in these networks. When a user
joins multiple networks, s/he can relay information from one network to another. Let
us consider the following typical scenario to illustrate this phenomenon. Jack, a user
of both Twitter and Facebook, logs in to Twitter and learns about an excellent product
from his friend. He right away falls in love with the new product and eagerly shares the
information by tweeting it. Moreover, he has configured his Twitter and Facebook
accounts, as illustrated in Fig. 4-1, to automatically post on his Facebook wall
whenever he has a new tweet and vice versa. As a consequence, the product information
is exposed to his friends in both networks and spreads further on both networks. If we
only consider the information propagation in one network, the reach of the information,
and thus the influence of users in these networks, is estimated incorrectly. In this
case, the influence of Jack should be the combination of his influence in both networks.
As shown in Fig. 4-2, the fraction of overlapping users is considerable; therefore,
studying the problem in only one network provides a solution that is quite different
from reality. This motivates studying the above problem on multiple networks, where the
influence of users is evaluated based on the multiple OSNs in which they participate.

Nearly all existing works study different variants of the massive influence
problem on a single network [18, 19, 34, 35, 37, 52, 61, 62]. Kempe et al. [34] first
formulated the influence maximization problem, which asks for a set of k users
who maximize the influence. The influence is propagated according to a stochastic
process called the Independent Cascade (IC) model, in which a user influences each of
his friends with probability proportional to the strength of their friendship. The
authors proved that the problem is NP-hard and proposed a greedy algorithm with an
approximation ratio of (1 - 1/e). After that, a considerable number of works studied the
problem variants and designed new algorithms on the same or extended models, such as
[18, 35]. There are also works on the linear threshold (LT) model for influence
propagation, in which a user adopts the new product when the total influence of his
friends surpasses some threshold. Feng et al. [62] showed NP-completeness for the
problem, and Dinh et al. [24] proved its inapproximability and proposed efficient
algorithms for a special case of the LT model. In their model, the influence between
users is uniform and a user is influenced if a certain fraction p of his friends are
active.

Recently, researchers have started to explore multiple networks, with the works of
Yagan et al. [58] and Liu et al. [38] studying the connection between offline and online
networks. The first work investigates the outbreak of information using the SIR model
on random networks. The second one analyzes networks formed by online interactions
and offline events. The authors focus on understanding the flow of information and
network clustering, and neither work studies a specific optimization problem of viral
marketing such as the massive influence problem. Shen et al. [49] explore
information propagation on multiple online social networks, taking into account the
interest and engagement of users. In their solution, all networks are combined into one
network by representing an overlapping user as a super node. This method cannot
preserve the properties of the individual networks.

Figure 4-2. The number of shared users between major OSNs in 2009 [2]

In this chapter, we study the Massive Influence problem (MIP), which asks for a
set of users of minimum cardinality that influences a certain fraction of users in
multiple networks. Assuming that we know the participation of users across all networks,
we exploit the additional information about overlapping users to identify the most
influential ones over multiple networks. Although the problem has been studied on a
single network, only special cases have been considered, such as uniform influence
between users in [24]. In addition, the overlapping users introduce several new
challenges, so the previous solutions cannot be easily adopted. For example, how should
the influence of overlapping users across multiple networks be evaluated? In which
network is a user easier to influence? We introduce novel coupling schemes which combine
multiple networks into one network while retaining the influential properties of the
original networks partially or fully. After coupling the networks, we can exploit
existing solutions on the single network to solve the problem. This is a powerful and
comprehensive procedure to study MIP. Moreover, we propose a new metric called influence
relay to analyze the flow of influence between networks. Through comprehensive
experiments, we discover crucial properties of multiple networks in diffusing
information.

The chapter is organized as follows. In Section 4.1, we present the influence
propagation model on multiple networks and define the problem. We then introduce the
method to align nodes in networks in Section 4.2. After that, we introduce different
coupling schemes in Sections 4.3 and 4.4. We next present the influence relay to study
the influence propagation process in Section 4.5. Section 4.6 shows experimental
results. In addition, we present coupling schemes for two stochastic cascading models
in Section 4.7. Finally, we summarize the chapter in Section 4.8.

4.1  Network  Model  and  Problem  Definition 
4.1.1  Graph  Notations 

We consider k networks $G^1, G^2, \ldots, G^k$, each of which is modeled as a weighted
directed graph $G^i = (V^i, E^i, \theta^i, W^i)$. The vertex set $V^i$ represents the
participation of $n^i = |V^i|$ users in the network $G^i$, and the edge set $E^i$
contains $m^i = |E^i|$ oriented connections $(u, v)$ (e.g., friendships or relationships)
among network users. $W^i = \{w^i(u, v)\}$ is the (normalized) weight function
associated with all edges in the ith network. In our model, the weight $w^i(u, v)$ can
also be interpreted as the strength of influence (or the strength of the relationship)
that a user u has on another user v in the ith network. The sets of incoming and
outgoing neighbors of vertex u in network $G^i$ are denoted by $N_u^{i-}$ and $N_u^{i+}$,
respectively. In addition, each user u is associated with a threshold $\theta^i(u)$
indicating the persistence of his opinions. The higher $\theta^i(u)$ is, the less likely
it is that u will be influenced by the opinions of his friends. Furthermore, the users
that actively participate in multiple networks are referred to as overlapping users.
Those users are considered as bridge users for information propagation across networks.
Finally, we denote by $G^{1..k}$ the system consisting of the k networks, and by U the
exhaustive set of all users, $U = \bigcup_{i=1}^{k} V^i$.

4.1.2  Influence  Propagation  Model 

We first describe the linear threshold model (LT model) [24, 62], a popular model
for information and influence diffusion in a single network, and then discuss how this
LT model can be extended to cope with multiple networks. In the original LT model,
each network user u is either in an active or inactive state: u is in an active state if
he originally adopts the information, or if the total influence from his direct
neighbors exceeds his threshold $\theta(u)$, i.e., $\sum_{v \in N(u)} w(v, u) \ge \theta(u)$.
Otherwise, u is in an inactive state.

In the big picture, given a system of k networks, the information is propagated
separately in each network and can only be transferred from one network to another
via the overlapping users of these networks. The information starts to spread from a
set of seed users S, i.e., all users in S are active and the remaining users are
inactive. At time t, a user u becomes active if the total influence from its active
neighbors surpasses its threshold in some network, i.e., there exists i such that:

$$\sum_{v \in N_u^{i-},\, v \in A} w^i(v, u) \ge \theta^i(u)$$

where A is the set of active users after time (t - 1).

After each time step, the newly activated users continue to activate other users.
The process continues until no more inactive users can be activated. If we limit the
propagation time to d, then the process stops after t = d time steps. The set of active
users caused by the seed set S after time d is denoted by $A^d(G^{1..k}, S)$. Note that d
is also the number of hops in the networks up to which the influence can be propagated
from the seed set, so d is called the number of propagation hops.
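The propagation process described above can be simulated directly. The following is a minimal sketch, assuming that users share the same id across networks, that each network is given as a hypothetical pair `(in_edges, theta)`, and that activation uses a "greater than or equal to" comparison against the threshold; none of these representation choices come from the text.

```python
def spread(networks, seeds, d):
    """Return the active set after at most d synchronous LT steps.

    networks: list of (in_edges, theta) pairs, one per network, where
    in_edges[u] maps each in-neighbor v of u to the weight w^i(v, u) and
    theta[u] is the threshold of u in that network.
    """
    active = set(seeds)
    for _ in range(d):
        newly = set()
        for in_edges, theta in networks:
            for u, nbrs in in_edges.items():
                if u in active:
                    continue
                # Total influence from neighbors active after the previous step.
                influence = sum(w for v, w in nbrs.items() if v in active)
                if influence >= theta[u]:
                    newly.add(u)  # activation in one network is enough
        if not newly:
            break  # nothing changed, so later steps cannot change anything
        active |= newly
    return active


# Toy system: user b overlaps both networks and relays the information.
net1 = ({"b": {"a": 0.6}, "c": {"b": 0.2}}, {"b": 0.5, "c": 0.9})
net2 = ({"c": {"b": 0.7}}, {"c": 0.5})
print(spread([net1, net2], {"a"}, 1))
print(spread([net1, net2], {"a"}, 2))
print(spread([net1], {"a"}, 5))
```

In the toy system, user c is never activated when only the first network is considered, but becomes active after two hops once the second network is included, which illustrates why single-network analysis can underestimate the reach of a seed set.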

4.1.3  Problem  Definition 

In this chapter, we address the fundamental vulnerability problem on multiple
networks: the Massive Influence problem. The problem asks to find a seed set of
minimum cardinality which influences a large fraction of users, formally defined as
follows.

Definition 4. (Massive Influence Problem (MIP)) Given a system of k networks $G^{1..k}$
with the set of users U, a positive integer d, and $0 < \beta < 1$, MIP asks to find
a seed set $S \subset U$ of minimum cardinality such that the number of active users
after d hops according to the LT model is at least a $\beta$ fraction of the users, i.e.,
$|A^d(G^{1..k}, S)| \ge \beta |U|$.

When k = 1, we obtain the variant of the problem on a single network, which is
NP-hard to solve [17]; however, it is easier to design heuristic algorithms for a single
network. In the next sections, we present different coupling strategies that reduce the
problem on multiple networks to one on a single network, so that algorithms designed for
a single network can be reused.

4.2  Network  Alignment 

We first reassign a universal identification (id) to each node in the networks such
that all overlapping nodes of the same user have the same id. Each network often uses
its own system for naming nodes, thus a person may have different ids in different
networks. As a consequence, keeping the user states updated across networks requires a
complicated mechanism and repeated extra effort. We ease this burden by assigning a
unique id to each user and using it as the node id in all networks. However, if we
naively assign a new id to an unassigned node and its overlapping nodes one by one, we
need to scan all the overlap mappings each time, which is impractical in large networks.
Thus, we need an algorithm that assigns new ids in linear time.

Our goal is to scan each overlap mapping and check each node only once to assign
new ids. Instead of assigning ids to all overlapping nodes of a user at the same time,
we check whether one of its overlapping nodes has already been assigned an id. To be
specific, we process each network in two phases: first assign ids to the nodes that
appear in its mapping lists with the already processed networks, and then assign new ids
to the remaining nodes. This method guarantees the validity of the new ids, as stated in
Lemma 3. The algorithm, described in Algorithm 10, scans each mapping and checks each
node once, so the total running time is linear in the total size of the networks and
overlap mappings.


Algorithm 10 Node-Alignment Algorithm
Require: k networks $G^{1..k}$ and overlap mappings $\{C^{ij}\}$.
Ensure: A new id mapping for nodes in $G^{1..k}$.
 1: newid ← 0
 2: Initialize id mapping M
 3: for i = 1 to k do
 4:     for j = 1 to i - 1 do
 5:         for each pair $(u, v) \in C^{ij}$ do
 6:             M[u] ← M[v]
 7:         end for
 8:     end for
 9:     for each $u \in V^i$ do
10:         if u is unassigned then
11:             M[u] ← newid
12:             newid ← newid + 1
13:         end if
14:     end for
15: end for
16: Return M
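A direct transcription of Algorithm 10 is sketched below. The data layout is an assumption made for illustration: networks are lists of local node names indexed from 0, nodes are keyed as `(network index, local name)`, and `overlaps[(i, j)]` with `j < i` lists pairs `(u, v)` stating that node u of network i and node v of network j belong to the same user.

```python
def align_nodes(networks, overlaps):
    """Assign universal ids following the two-phase scheme of Algorithm 10.

    Returns a dict mapping (network index, local name) to a universal id.
    The representation of networks and overlap mappings is hypothetical.
    """
    mapping = {}
    next_id = 0
    for i, nodes in enumerate(networks):
        # Phase 1: inherit ids from networks that were already processed.
        for j in range(i):
            for u, v in overlaps.get((i, j), []):
                mapping[(i, u)] = mapping[(j, v)]
        # Phase 2: fresh ids for nodes not matched to any earlier network.
        for u in nodes:
            if (i, u) not in mapping:
                mapping[(i, u)] = next_id
                next_id += 1
    return mapping


# Toy example with three networks and two overlapping users.
networks = [["alice", "bob"], ["al", "carol"], ["bobby", "a_l", "dave"]]
overlaps = {
    (1, 0): [("al", "alice")],
    (2, 0): [("bobby", "bob")],
    (2, 1): [("a_l", "al")],
}
ids = align_nodes(networks, overlaps)
print(ids)
```

Each overlap pair and each node is visited once, so the running time is linear in the total size of the networks and mappings, provided dictionary operations are treated as constant time.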


Lemma 3. Node-Alignment Algorithm assigns the same id to all overlapping nodes of
the same user and different ids to nodes of different users.

Proof. Let $u^{i_1}, u^{i_2}, \ldots, u^{i_l}$ be the overlapping nodes of user u in networks
$i_1 < i_2 < \ldots < i_l$. Then $M[u^{i_l}] = M[u^{i_{l-1}}] = \ldots = M[u^{i_1}]$ due to
line 6 of the algorithm. Thus, all overlapping nodes of user u have the same id.

Next, consider two nodes $u^i$ and $v^j$ of different users u and v. Denote by $u^{i_0}$
($v^{j_0}$) the overlapping node of u (v) in the network $i_0$ ($j_0$) with the smallest
index. Without loss of generality, suppose that $u^{i_0}$ is assigned an id after
$v^{j_0}$. By the choice of $i_0$, $u^{i_0}$ does not appear in any overlap mapping with
previously processed networks, hence it is assigned a new id. Thus, $u^{i_0}$ and
$v^{j_0}$ have different ids, and so do $u^i$ and $v^j$. □

4.3  Lossless  Coupling  Schemes 

In this section, we present two schemes to couple multiple networks into a new
single network with respect to the influence diffusion process on each participant
network. A notable advantage of this newly coupled graph is that we can use any
existing algorithm on a single network to produce a solution on multiple networks
with the same quality. Unfortunately, we encounter a series of challenging issues in
designing such coupling schemes:

(1) The heterogeneity of user participation. A user may join one, two, or more
networks. How can we recognize and differentiate these users in the coupled
network? How can we capture the roles of each user in the multiple networks?

(2) The process of information and influence propagation among networks. In multiple
networks, when a user is influenced he tends to immediately propagate the
information on all networks that he is a part of. How can we describe this
immediate transmission of the information between networks in just a single
network?

(3) Preserving properties of individual networks. The coupled network should be a
good representative of all the individual networks. It should preserve the diffusion
properties of all the networks. This enables us to establish the relationship between
the solution of the problem on the coupled network and on the original networks. How
can we design a coupling scheme that addresses this issue?

Next, we present the method to overcome these challenges. The first issue is
solved by introducing dummy nodes for each user u in the networks that it does not
belong to. These dummy nodes are isolated. Now the vertex set $V^i$ of the ith network
can be represented by $V^i = \{u_1^i, u_2^i, \ldots, u_n^i\}$, where
$U = \{u_1, u_2, \ldots, u_n\}$ is the set of all users. In the new representation, there
is an edge from $u_p^i$ to $u_q^i$ if $u_p$ and $u_q$ are connected in $G^i$. Now we can
take the union of all k networks to form a new network G. The approach to overcome the
second challenge is to allow the nodes $u^1, u^2, \ldots, u^k$ of a user u to influence
each other, e.g., by adding an edge $(u^i, u^j)$ with weight $\theta(u^j)$. When $u^i$ is
influenced, $u^j$ is also influenced in the next time step, as they are actually a single
overlapping user u; thus the information is transferred from network $G^i$ to $G^j$.
However, a problem emerges: the information is delayed when it is transferred between
two networks. Right after being activated, $u^i$ will influence its neighbors, while
$u^j$ needs one more time step before it starts to influence its neighbors. It would be
better if both $u^i$ and $u^j$ started to influence their neighbors at the same time. For
this reason, a new gateway node $u^0$ is added to G such that both $u^i$ and $u^j$ can
only influence other nodes through $u^0$. In particular, all edges $(u^i, v^i)$
(respectively $(u^j, z^j)$) are replaced by edges $(u^0, v^i)$ (respectively
$(u^0, z^j)$). In addition, more edges are added between $u^0$, $u^i$, and $u^j$ to let
them influence each other. We next describe the coupling schemes and how they couple the
multiple networks while preserving their individual properties.

4.3.1  Clique  Lossless  Coupling  Scheme 

Given $k$ networks $G^1, G^2, \ldots, G^k$ with the set of users $U$, we construct a new graph
$G = (V, E, \theta, w)$ as follows.

Firstly, we add dummy vertices of threshold 1 to all these networks and include all
nodes in the vertex set $V$ together with the gateway nodes:
$V = \bigcup_{i=1}^{k} V^i \cup \{u_1^0, u_2^0, \ldots, u_n^0\}$.
In the new vertex set, $u_1^0, u_2^0, \ldots, u_n^0$ represent the set of users in the coupled network
and are called user vertices. Vertex $u_p^i$ is called the account vertex of user $u_p$ in $G^i$. The
thresholds of the former type of vertices are set to $\theta(u_p^0) = 1$, $1 \le p \le n$, and the
thresholds of the latter type of vertices are kept the same as in the multiple networks,
i.e., $\theta(u_p^i) = \theta^i(u_p)$.
Secondly, we represent the influence of user $u$ on user $v$ in network $G^i$ by the
influence of user vertex $u^0$ on the account vertex of $v$ in $G^i$. It means that if there is an
edge from user $u$ to user $v$ in $G^i$, then an edge from $u^0$ to $v^i$ with weight
$w(u^0, v^i) = w^i(u, v)$ is added to the edge set $E$.

Finally, we connect the user vertex and account vertices of the same user to guarantee
that they have the same activation state. The goal is that if one of these nodes is active, it
will activate all other nodes. This can be done by adding extra edges, called synchronization
edges, between these nodes, whose weights equal the thresholds of the destination
nodes. Specifically, $w(u^i, u^j) = \theta(u^j)$, $\forall\, 0 \le i, j \le k$, $i \ne j$. These synchronization
edges form a clique between nodes, thus this coupling scheme is named the clique lossless
coupling scheme. A simple example of the scheme is illustrated in Figure 4-3.
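The three construction steps above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function name `clique_lossless_coupling`, the dictionary-based graph encoding, and the `(user, index)` vertex labels (index 0 for the user vertex, index i for the account vertex in the ith network) are assumptions made for the example.

```python
from itertools import permutations

def clique_lossless_coupling(networks, thresholds, users):
    """Build the coupled graph of the clique lossless scheme (illustrative sketch).

    networks[i]   : dict mapping a directed edge (u, v) to its weight in network i+1
    thresholds[i] : dict mapping each user of network i+1 to its threshold there
    users         : all user ids
    Returns (theta, weight) for the coupled graph, keyed by (user, index) vertices.
    """
    users = list(users)
    k = len(networks)

    # Step 1: user vertices (threshold 1) and account vertices; a user missing
    # from a network gets a dummy account vertex with threshold 1.
    theta = {}
    for u in users:
        theta[(u, 0)] = 1.0
        for i in range(1, k + 1):
            theta[(u, i)] = thresholds[i - 1].get(u, 1.0)

    weight = {}
    # Step 2: influence of u on v in network i becomes an edge from the
    # user vertex of u to the account vertex of v.
    for i, edges in enumerate(networks, start=1):
        for (u, v), w in edges.items():
            weight[((u, 0), (v, i))] = w

    # Step 3: synchronization clique among the k + 1 vertices of each user,
    # each edge weighted by the threshold of its destination.
    for u in users:
        for i, j in permutations(range(k + 1), 2):
            weight[((u, i), (u, j))] = theta[(u, j)]
    return theta, weight
```

Because every synchronization edge carries exactly the threshold of its destination, a single active vertex of a user is enough to activate all of that user's other vertices in the next hop.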

Next, we show that the propagation processes in the original multiple networks and in
the coupled network are actually the same. Influence is propagated alternately between
user and account vertices, so a problem with $d$ hops in the multiple networks is equivalent
to a problem with $2d$ hops in the coupled network.

Lemma 4. Suppose that the propagation process on the coupled network $G$ starts
from a seed set that contains only user vertices, $S = \{s_1^0, \ldots, s_p^0\}$. Then user vertices are
only activated in even propagation hops.

Proof. Suppose, for contradiction, that $u^0$ is the first user vertex that is activated at an odd
hop $2d + 1$. Then $u^0$ must be activated by some account vertex $u^i$, and $u^i$ is the first activated
vertex among $u^1, u^2, \ldots, u^k$. It means that $u^i$ is activated in hop $2d$. Since all
incoming neighbors of $u^i$ other than the vertices of $u$ itself are user vertices, some user
vertex changes its status to active in hop $2d - 1$, an odd hop earlier than $2d + 1$. This is a
contradiction. □

Lemma 5. Suppose that the propagation processes on $G^{1 \ldots k}$ and $G$ start from the
same seed set $S$. Then the following conditions are equivalent:

(1) User $u$ is active after $d$ propagation hops in $G^{1 \ldots k}$.
Figure 4-3. An example of the lossless coupling scheme. (A) Multiple networks G1, G2, G3
with 6 users, where vertices with the same color represent the same user. Each user may
have different thresholds in different networks, e.g., the red user has thresholds of 0.6, 0.3,
and 0.8. (B) The influence between users in multiple networks is encoded by the influence
from user vertices to account vertices. Dummy account vertices are added to guarantee that
all users have the same number of account vertices. (C) Clique synchronization. Legend:
user vertex, account vertex, dummy account vertex.
(2) There exists $i$ such that $u^i$ is active after $2d - 1$ propagation hops in $G$.

(3) Vertex $u^0$ is active after $2d$ propagation hops in $G$.

Proof. We prove this lemma by induction. Suppose it is correct for any $1 \le d \le t$;
we need to prove it is correct for $d = t + 1$. Denote by $A^{1 \ldots k}(t)$ and $A(t)$ the sets of active
users and active vertices after $t$ propagation hops in $G^{1 \ldots k}$ and $G$, respectively.

(1) ⇒ (2): If user $u$ is active at time $t + 1$ in $G^{1 \ldots k}$, it must be activated in some
network $G^j$. We have:

$$\sum_{v \in N^-_{u^j} \cap A^{1 \ldots k}(t)} w^j(v, u) \ge \theta^j(u)$$

Due to the induction assumption, for each $v \in A^{1 \ldots k}(t)$, we also have $v^0 \in A(2t)$ in
$G$. Thus:

$$\sum_{v^0 \in N^-_{u^j} \cap A(2t)} w(v^0, u^j) = \sum_{v \in N^-_{u^j} \cap A^{1 \ldots k}(t)} w^j(v, u) \ge \theta^j(u) = \theta(u^j)$$

It means that $u^j$ is active after $2(t + 1) - 1$ propagation hops.

(2) ⇒ (3): If there exists $i$ such that $u^i$ is active after $2(t + 1) - 1$ propagation hops
on $G$, then $u^i$ will activate $u^0$ in hop $2(t + 1)$.

(3) ⇒ (1): Suppose that $u^0 \notin S$ is active after $2(t + 1)$ propagation hops in $G$. Then
there must exist $u^j$ which activates $u^0$ before. This is equivalent to:

$$\sum_{v^0 \in N^-_{u^j},\ v^0 \in A(2t)} w(v^0, u^j) \ge \theta(u^j)$$

For each $v^0 \in A(2t)$, we also have $v \in A^{1 \ldots k}(t)$. Substituting this into the above
inequality, we have:

$$\sum_{v \in N^-_{u^j} \cap A^{1 \ldots k}(t)} w^j(v, u) = \sum_{v^0 \in N^-_{u^j} \cap A(2t)} w(v^0, u^j) \ge \theta(u^j) = \theta^j(u)$$

Thus, $u$ is active in network $G^j$ after $t + 1$ propagation hops. □
Next, we show that the number of influenced vertices in the coupled network
is always $(k + 1)$ times the number of influenced users in the multiple networks, as stated in
Theorem 4.1.

Theorem 4.1. Given a system of $k$ networks $G^{1 \ldots k}$ with the user set $U$, the coupled
network $G$ produced by the lossless coupling scheme, and a seed set $S = \{s_1, s_2, \ldots, s_p\}$, if
$A^d(G^{1 \ldots k}, S) = \{a_1, a_2, \ldots, a_q\}$ is the set of active users caused by $S$ after $d$ propagation
hops in the multiple networks, then
$A^{2d}(G, S) = \{a_1^0, a_1^1, \ldots, a_1^k, \ldots, a_q^0, a_q^1, \ldots, a_q^k\}$ is the set of
active vertices caused by $S$ after $2d$ propagation hops in the coupled network.

Proof. For each user $a_i \in A^d(G^{1 \ldots k}, S)$, i.e., $a_i$ is active after $d$ hops in $G^{1 \ldots k}$,
there exists $a_i^j$ which is active after $2d - 1$ hops in $G$ according to Lemma 5. As
a consequence, all of $a_i^0, a_i^1, \ldots, a_i^k$ are active after $2d$ hops. So
$B = \{a_1^0, a_1^1, \ldots, a_1^k, \ldots, a_q^0, a_q^1, \ldots, a_q^k\} \subseteq A^{2d}(G, S)$.

Now consider a vertex of $A^{2d}(G, S)$, which is either:

Case 1. A user vertex $u^0$ which is active after $2d$ hops in $G$, so user $u$ must be
active after $d$ hops in $G^{1 \ldots k}$. This implies $u \in A^d(G^{1 \ldots k}, S)$, thus $u^0 \in B$.

Case 2. An account vertex $u^i$. If $u^i$ is active after $2d - 1$ hops, then $u$ must be active
after $d$ hops due to Lemma 5, thus $u \in A^d(G^{1 \ldots k}, S)$. Otherwise, $u^i$ is activated at hop $2d$,
so it must be activated by some vertex $u^j$, $j > 0$, since all user vertices only change their state
at even hops. Again, $u \in A^d(G^{1 \ldots k}, S)$. This results in $u^i \in B$.

From the two cases above, we also have $A^{2d}(G, S) \subseteq B$. So $A^{2d}(G, S) = B$, and the
proof is completed. □

Theorem 4.1 provides the basis to derive the solution for MIP on multiple networks
from the solution on a single network. It implies an important algorithmic property of the
lossless coupling scheme regarding the relationship between the solutions of MIP in
$G^{1 \ldots k}$ and $G$. The equivalence of the two solutions is stated below:
Theorem 4.2. When the lossless scheme is used, the set $S = \{s_1, s_2, \ldots, s_p\}$ influences
a $\beta$ fraction of users in $G^{1 \ldots k}$ after $d$ propagation hops if and only if $S' = \{s_1^0, s_2^0, \ldots, s_p^0\}$
influences a $\beta$ fraction of vertices in the coupled network $G$ after $2d$ propagation hops.
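The equivalence can be checked on a toy instance. The sketch below is only an illustration under stated assumptions: it assumes hop-limited threshold propagation in which a vertex becomes active once the total weight from its active in-neighbors reaches its threshold, it treats a user as active in the multiple networks as soon as it is activated in any one of them, and all function names and the four-user example are hypothetical.

```python
from itertools import permutations

def couple_clique(networks, thresholds, users):
    """Clique lossless coupling; (u, 0) is the user vertex, (u, i) the account in network i."""
    k = len(networks)
    theta, weight = {}, {}
    for u in users:
        theta[(u, 0)] = 1.0
        for i in range(1, k + 1):
            theta[(u, i)] = thresholds[i - 1].get(u, 1.0)  # dummy accounts: threshold 1
        for i, j in permutations(range(k + 1), 2):
            weight[((u, i), (u, j))] = theta[(u, j)]       # synchronization edges
    for i, edges in enumerate(networks, start=1):
        for (u, v), w in edges.items():
            weight[((u, 0), (v, i))] = w                   # influence edges
    return theta, weight

def propagate(theta, weight, seeds, hops):
    """Hop-limited threshold propagation on a single weighted digraph."""
    active = set(seeds)
    for _ in range(hops):
        incoming = {}
        for (u, v), w in weight.items():
            if u in active and v not in active:
                incoming[v] = incoming.get(v, 0.0) + w
        active |= {v for v, x in incoming.items() if x >= theta[v]}
    return active

def propagate_multi(networks, thresholds, seeds, hops):
    """Same process on the system of networks: active in any network means active."""
    active = set(seeds)
    for _ in range(hops):
        newly = set()
        for edges, theta in zip(networks, thresholds):
            incoming = {}
            for (u, v), w in edges.items():
                if u in active and v not in active:
                    incoming[v] = incoming.get(v, 0.0) + w
            newly |= {v for v, x in incoming.items() if x >= theta[v]}
        active |= newly
    return active

# Hypothetical system: a and b share network 1; b, c and d share network 2.
networks = [{("a", "b"): 0.6}, {("b", "c"): 0.9, ("c", "d"): 0.2}]
thresholds = [{"a": 0.5, "b": 0.5}, {"b": 0.5, "c": 0.8, "d": 0.9}]
users = ["a", "b", "c", "d"]
d = 2

active_users = propagate_multi(networks, thresholds, {"a"}, d)
theta, weight = couple_clique(networks, thresholds, users)
active_vertices = propagate(theta, weight, {("a", 0)}, 2 * d)
```

In this instance the influence reaches b through network 1 and then c through network 2, so three of the four users are active after d = 2 hops, and the same three users account for every active vertex of the coupled network after 2d = 4 hops.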

Size  of  the  coupled  network.  The  size  of  the  coupled  network  can  be  computed 
from  the  sizes  of  the  original  networks  as  follows: 

Proposition 4.1. When the lossless scheme is used, the coupled network has
$|V| = (k + 1)|U| = (k + 1)n$ vertices and $|E| = \sum_{i=1}^{k} |E^i| + nk(k + 1)$ edges.

Proof. In the coupling scheme, each user $u$ has $k + 1$ corresponding vertices $u^0, u^1, \ldots, u^k$
in the coupled network, thus the number of vertices is $|V| = (k + 1)|U| = (k + 1)n$.

The number of edges equals the total number of edges from all input networks plus
the number of new edges for synchronizing. Thus the total number of edges is
$|E| = \sum_{i=1}^{k} |E^i| + nk(k + 1)$. □

4.3.2  Star  Lossless  Coupling  Scheme 

In the clique lossless coupling scheme, the number of edges needed to synchronize the state of
vertices $u^0, u^1, \ldots, u^k$ is $k(k + 1)$ for each user $u$, which results in $nk(k + 1)$ extra edges in
the coupled network. In real networks, the number of edges is often linear in the number
of vertices, so the number of extra edges considerably increases the size of the coupled
network, especially when $k$ is large. We would like to design a synchronization strategy
that reduces these extra edges.

Note that the large number of extra edges is due to the direct synchronization
between every pair of account vertices of $u$ in the clique lossless coupling scheme, so
we can save some edges by using indirect synchronization. We create one more
intermediate vertex $u^{k+1}$ with threshold $\theta(u^{k+1}) = 1$ and let the active state propagate
from any vertex in $u^1, u^2, \ldots, u^k$ via this vertex. Specifically, the synchronization edges
are established as follows: $w(u^i, u^{k+1}) = 1$ and $w(u^{k+1}, u^i) = \theta(u^i)$, $1 \le i \le k$;
$w(u^{k+1}, u^0) = w(u^0, u^{k+1}) = 1$. The synchronization strategy of the star lossless coupling
scheme is illustrated in Fig. 4-4.

Figure 4-4. Star Synchronization

Now, the number of extra edges for each user is $2(k + 1)$ and the
size of the coupled network is reduced to:

Proposition 4.2. When the star lossless scheme is used, the coupled network has
$|V| = (k + 2)|U| = (k + 2)n$ vertices and $|E| = \sum_{i=1}^{k} |E^i| + 2n(k + 1)$ edges.
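The two size formulas can be compared directly. The helper names and the sample numbers below (1000 users, 5 networks, 10000 original edges) are illustrative assumptions, not values from the experiments.

```python
def clique_coupled_size(n, k, total_edges):
    """(|V|, |E|) of the clique lossless coupled network, per Proposition 4.1."""
    return (k + 1) * n, total_edges + n * k * (k + 1)

def star_coupled_size(n, k, total_edges):
    """(|V|, |E|) of the star lossless coupled network, per Proposition 4.2."""
    return (k + 2) * n, total_edges + 2 * n * (k + 1)

# Hypothetical system: 1000 users, 5 networks, 10000 edges in total.
print(clique_coupled_size(1000, 5, 10000))  # (6000, 40000)
print(star_coupled_size(1000, 5, 10000))    # (7000, 22000)
```

Since the per-user overhead is $k(k + 1)$ edges for the clique and $2(k + 1)$ for the star, the star scheme saves edges only when $k > 2$; for $k = 2$ the two counts coincide, and the star scheme always adds one more vertex per user.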

In the star lossless coupling scheme, it takes 2 hops to synchronize the states of the
account vertices of each user, which delays the propagation of influence in the
coupled network. Due to the similarity between the star lossless scheme and the clique lossless
scheme, we state the following property of the star lossless scheme without proof.

Theorem 4.3. When the star lossless coupling scheme is used, the set $S = \{s_1, s_2, \ldots, s_p\}$
influences a $\beta$ fraction of users in $G^{1 \ldots k}$ after $d$ propagation hops if and only if
$S' = \{s_1^0, s_2^0, \ldots, s_p^0\}$ influences a $\beta$ fraction of vertices in the coupled network $G$ after $3d$
propagation hops.

4.4  Lossy  Coupling  Schemes 

In the preceding coupling schemes, a complicated coupled network is produced
with a large number of auxiliary vertices and edges. It is ideal to have a coupled network
which contains only users as nodes. Such a network provides a compact view of the
relationships between users across the whole system of networks. When compacting the
information that is completely described by the whole system into one network, some
loss of information is unavoidable. The goal is to design a scheme that minimizes
the loss as much as possible, i.e., the solution for the problem in the coupled network is
very close to the one in the original system. Next, we present such a scheme based on the
following key observations.

Observation 1. Consider user $u$; $u$ will be activated if there exists $i$ such that:

$$\sum_{v^i \in N^-_{u^i},\ v \in A} w^i(v^i, u^i) \ge \theta^i(u^i)$$

where $A$ is the set of active users.

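We can relax the condition to activate $u$ with positive parameters $\alpha^1(u), \alpha^2(u), \ldots,
\alpha^k(u)$ as follows:

$$\sum_{i=1}^{k} \alpha^i(u) \Big( \sum_{v^i \in N^-_{u^i},\ v \in A} w^i(v^i, u^i) \Big) \ge \sum_{i=1}^{k} \alpha^i(u)\, \theta^i(u^i) \qquad (4\text{-}1)$$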

Proposition 4.3. Given a system of networks $G^{1 \ldots k}$, if condition (4-1) is satisfied,
then user $u$ is activated.

Proof. When the condition is satisfied, there must exist $i$ such that
$\alpha^i(u) \sum_{v^i \in N^-_{u^i},\ v \in A} w^i(v^i, u^i) \ge \alpha^i(u)\, \theta^i(u^i)$.
As a consequence, the condition to activate $u$ is satisfied since $\alpha^i(u) > 0$. □

Note that sometimes the condition to activate $u$ is met, but condition (4-1) still
needs more influence from $u$'s friends to be satisfied. The more extra influence is needed,
the looser condition (4-1) is. We can reduce this redundancy by increasing the value
of $\alpha^i(u)$ proportionally to the value of $\sum_{v^i \in N^-_{u^i},\ v \in A} w^i(v^i, u^i) - \theta^i(u^i)$.
In the special case where $\sum_{v^i \in N^-_{u^i},\ v \in A} w^i(v^i, u^i) \ge \theta^i(u^i)$ and we choose
$\alpha^i(u) \gg \alpha^j(u)$, $\forall j \ne i$, there is no
redundancy. Unfortunately, we do not know beforehand in which network user $u$ will be
activated, so we can only choose parameters heuristically.
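A small numeric check makes the redundancy concrete. The sketch below is illustrative only: the function names and the two-network values are assumptions, with `influence[i]` standing for the total weight user u currently receives from active in-neighbors in network i.

```python
def activated_in_some_network(influence, theta):
    """Original rule: u is activated if its threshold is met in at least one network."""
    return any(x >= t for x, t in zip(influence, theta))

def relaxed_condition(influence, theta, alpha):
    """Condition (4-1): the alpha-weighted influence reaches the alpha-weighted threshold."""
    lhs = sum(a * x for a, x in zip(alpha, influence))
    rhs = sum(a * t for a, t in zip(alpha, theta))
    return lhs >= rhs

# Hypothetical user: threshold met in network 1 (0.2 >= 0.1) but not in network 2.
influence = [0.2, 0.1]
theta = [0.1, 0.7]

print(activated_in_some_network(influence, theta))        # True
print(relaxed_condition(influence, theta, [1.0, 1.0]))    # False: (4-1) is too strict here
print(relaxed_condition(influence, theta, [10.0, 1.0]))   # True: weight favors network 1
```

With equal parameters the shortfall in network 2 hides the activation in network 1, while a much larger parameter for network 1 removes the redundancy, which is the situation the paragraph above describes.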

Observation 2. When user $u$ participates in multiple networks, it is easier to
influence $u$ in some networks than in others. The following simple case illustrates
such a situation. Suppose that we have two networks. In network 1, $\theta^1(u^1) = 0.1$ and
$u^1$ has 8 in-neighbors; each neighbor $v^1$ influences $u^1$ with $w^1(v^1, u^1) = 0.1$. In
network 2, $\theta^2(u^2) = 0.7$ and $u^2$ has 8 in-neighbors; each neighbor $v^2$ influences $u^2$
with $w^2(v^2, u^2) = 0.1$. The number of active neighbors needed to activate $u$ is 1 and 7 in networks
1 and 2, respectively. Intuitively, we can say that $u$ is easier to influence in the first
network. We quantify the influence easiness $e^i(u)$ with which $u$ is influenced in network $i$ as the
ratio between the total influence from friends and the threshold to be influenced:

$$e^i(u) = \frac{\sum_{v^i \in N^-_{u^i}} w^i(v^i, u^i)}{\theta^i(u^i)}$$

We can use the influence easiness of a user in the networks as the parameters of
condition (4-1).

Figure 4-5. Lossy coupled network using easiness parameters. The number of edges is
much less than in the lossless coupled network.
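For the two-network example in Observation 2, the easiness values can be computed exactly with rational arithmetic. The helper name `easiness` is an assumption made for this sketch.

```python
from fractions import Fraction

def easiness(in_weights, threshold):
    """Ratio of the total incoming influence to the activation threshold."""
    return sum(in_weights, Fraction(0)) / threshold

tenth = Fraction(1, 10)
e1 = easiness([tenth] * 8, Fraction(1, 10))  # network 1: threshold 0.1
e2 = easiness([tenth] * 8, Fraction(7, 10))  # network 2: threshold 0.7
print(e1, e2)  # 8 8/7
```

The easiness is 8 in network 1 and 8/7 in network 2, matching the intuition that $u$ is much easier to influence in the first network.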

Based on the above observations, we couple multiple networks into one using
parameters $\{\alpha^i(u)\}$. The vertex set is the set of users $V = \{u_1, u_2, \ldots, u_n\}$. The threshold
of vertex $u$ is set to:

$$\theta(u) = \sum_{i=1}^{k} \alpha^i(u)\, \theta^i(u^i)$$
The weight of the edge $(v, u)$ is:

$$w(v, u) = \sum_{i=1}^{k} \alpha^i(u)\, w^i(v^i, u^i)$$

where $w^i(v^i, u^i) = 0$ if there is no edge from $v^i$ to $u^i$ in the $i$th network.

Then the set of edges is $E = \{(v, u) \mid w(v, u) > 0\}$. Fig. 4-5 illustrates the lossy
coupled network of the system of networks in Fig. 4-3.
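The lossy construction can be sketched as follows. This is an illustration under assumptions: the function name, the dictionary encoding, and the sample parameters are hypothetical, and a network that a user does not belong to is simply skipped in that user's threshold sum (an assumption, since the text does not say how such terms are handled).

```python
from collections import defaultdict

def lossy_coupling(networks, thresholds, alpha):
    """Merge k networks into one user-level network using parameters alpha.

    networks[i]   : dict mapping a directed edge (v, u) to its weight in network i
    thresholds[i] : dict mapping each user of network i to its threshold there
    alpha[i]      : dict mapping each user of network i to its positive parameter
    """
    theta = defaultdict(float)
    weight = defaultdict(float)
    for i, edges in enumerate(networks):
        for u, th in thresholds[i].items():
            theta[u] += alpha[i][u] * th        # threshold: alpha-weighted sum
        for (v, u), w in edges.items():
            weight[(v, u)] += alpha[i][u] * w   # edge weight: alpha-weighted sum
    # Keep only edges with positive coupled weight.
    edges = {e: w for e, w in weight.items() if w > 0}
    return dict(theta), edges

# Hypothetical two-network system.
networks = [{("a", "b"): 0.5}, {("a", "b"): 0.2, ("c", "b"): 0.3}]
thresholds = [{"a": 0.3, "b": 0.4}, {"a": 0.5, "b": 0.6, "c": 0.2}]
alpha = [{"a": 1.0, "b": 2.0}, {"a": 1.0, "b": 1.0, "c": 1.0}]
theta, edges = lossy_coupling(networks, thresholds, alpha)
```

Here user b ends up with threshold 2(0.4) + 1(0.6) = 1.4 and the edge (a, b) with weight 2(0.5) + 1(0.2) = 1.2, so the coupled network has one vertex per user and no auxiliary vertices.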

Besides  the  easiness,  other  metrics  can  be  used  with  the  same  purpose.  We 
enumerate  here  some  other  metrics. 

Involvement. Nodes can be part of multiple social networks, but typically they are
more involved in a few of them compared to others. We estimate the involvement of a node
$v$ in a network $G_i$ by measuring how strongly the 1-hop neighborhood of $v$ is connected
and to what extent influence can propagate from one node to another in the 1-hop
neighborhood. Formally, we can define the involvement of a node $v$ in network $G_i$ as:

$$\sum_{x, y \in N^i(v) \cup \{v\}} \frac{w^i(x, y)}{\theta^i_y}$$

where $N^i(v)$ is the set of all neighbors of $v$ (both incoming and outgoing), $w^i(x, y)$
is the weight of edge $(x, y)$, and $\theta^i_y$ is the threshold of $y$ in $G_i$.

Average. This is a baseline scheme used only for comparison purposes. We simply take
an average of the thresholds and edge weights over all the networks to which $v$ belongs. So
the average parameter of a node $v$ in network $i$ can be defined as

$$\alpha^i_v = \frac{1}{|P(v)|}$$

Next, we show the relationship between the solution for the influence maximization
problem in the lossy coupled network and in the original system of networks. As discussed
in the above observations, if the propagation process starts from the same set of users
in the network system $G^{1 \ldots k}$ and the coupled network, then the active state of a user in
$G$ implies its active state in $G^{1 \ldots k}$. It means that if the set of users $S$ activates a $\beta$ fraction
of users in $G$, it also activates at least a $\beta$ fraction of users in $G^{1 \ldots k}$. We have the following
result.

Theorem 4.4. When the lossy coupling scheme is used, if the set of users $S$ activates a $\beta$
fraction of users in $G$, then it activates at least a $\beta$ fraction of users in $G^{1 \ldots k}$.

4.5  Influence  Relay 

We propose the influence relay metric to quantify the role of users in propagating
information. When information is diffused in multiple networks, it may
flow within a single network or go through two or more networks. This raises a series
of questions: How much information flows inside a network? How much information
flows from one network to another? How much does each network contribute to
the influence propagation? Once we can quantify these values, we can gain insights into
the influence diffusion process in multiple networks. Next, we define the influence relay
metric and related concepts to measure these values.

Since we can use a single network to simulate the diffusion process in multiple
networks, we first measure the information flowing through each node in a single
network. Suppose that the information breaks out from the seed set $S$ in the network
$G$ and stops after $d$ hops with the set of influenced vertices $A^d(G, S)$. Intuitively, the
influence relay of each vertex is the amount of influence it relays to other nodes after
adopting the information. The more vertices it helps to influence, the higher
its influence relay is. In addition, if it has a strong influence on a node with a high influence
relay, it should also have a high influence relay even if it does not directly
influence many vertices. For these reasons, we formally propose the influence relay
metric $IR(\cdot)$, which is computed iteratively as below.

All inactive vertices have an influence relay of 0.

Each active vertex $v$ without activated outgoing neighbors has an influence relay
of 1: $v$ does not activate any vertices, but it contributes itself as one active node to
$A^d(G, S)$.

The influence relay of any other vertex $u$ is computed based on the influence relay
of its outgoing active neighbors. Specifically, the influence relay of $u$ is:

$$IR(u) = 1 + \sum_{v \in N^+_u \,\wedge\, h(u) < h(v)} \frac{w(u, v)\, IR(v)}{\sum_{z \in N^-_v \,\wedge\, h(z) < h(v)} w(z, v)} \qquad (4\text{-}2)$$

where $h(u)$ is the hop at which $u$ is activated.

In this formula, the influence relay of $u$ is the total influence relay of the vertices which
are under $u$'s influence. However, each such vertex $v$ is under the influence of
many vertices. Among them, $u$ contributes the impact of $w(u, v)$ to influence $v$, hence $u$
is responsible for only a fraction $\frac{w(u, v)}{\sum_{z \in N^-_v \,\wedge\, h(z) < h(v)} w(z, v)}$ of $v$'s influence relay. Moreover, we add 1 to the
influence relay of $u$ since $u$ also contributes itself to the set of activated vertices.

We next present an efficient algorithm to compute the influence relay of all vertices.
Since the influence relay of each node depends on its outgoing neighbors which are
activated later than it, we need to compute the influence relay of nodes in the reverse
order of the diffusion process. We can construct the influence graph $IG_S = (V_S, E_S)$
from the seed set $S$ to represent the diffusion process and compute the influence relay
of all nodes in $V_S$. The vertex set $V_S$ of $n_S$ nodes is $A^d(G, S)$. There is an edge from $u$
to $v$ in $E_S$, with the same weight $w(u, v)$ as in $G$, if $u$ has passed the information to $v$, i.e.,
$u, v \in A^d(G, S)$ and $h(v) > h(u)$. The graph $IG_S$ is a directed acyclic graph, thus we can
compute the influence relay of all vertices in the reverse topological ordering of $IG_S$ as
described in the CIR algorithm (Algorithm 11).

Proposition 4.4. The influence graph $IG_S$ caused by the seed set $S$ in the network $G$ is
a directed acyclic graph.

Proof. If $IG_S$ has a cycle $u_1, u_2, \ldots, u_t, u_1$, then $h(u_1) < h(u_2) < \ldots < h(u_t) < h(u_1)$,
a contradiction. □
Algorithm 11 Computing Influence Relay (CIR)

Require: A network G, a seed set S and the number of hops d.
Ensure: The influence relay IR of all vertices.

IG_S <- the influence graph caused by S on G
for each u in V_S do
    IR(u) <- 0
end for
Compute the topological ordering u_1, u_2, ..., u_{n_S} of the vertices in V_S
for i = n_S down to 1 do
    IR(u_i) <- IR(u_i) + 1
    total <- 0
    for each v in N^-(u_i) do
        total <- total + w(v, u_i)
    end for
    for each v in N^-(u_i) do
        IR(v) <- IR(v) + IR(u_i) * w(v, u_i) / total
    end for
end for
Return IR
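A compact Python version of the CIR procedure is sketched below. It is an illustration under assumptions rather than the authors' code: the diffusion is taken as already simulated, so the input is a hypothetical `hop` dictionary giving the activation hop $h(u)$ of every activated vertex, and sorting by hop is used as the topological ordering, which is valid because every influence-graph edge goes from a lower hop to a strictly higher one.

```python
def influence_relay(weight, hop):
    """Compute the influence relay of every activated vertex.

    weight : dict mapping a directed edge (u, v) of G to w(u, v) > 0
    hop    : dict mapping each activated vertex to its activation hop h(u)
    """
    # Incoming neighbors in the influence graph: active and activated earlier.
    preds = {v: [] for v in hop}
    for (u, v), w in weight.items():
        if u in hop and v in hop and hop[u] < hop[v]:
            preds[v].append((u, w))

    ir = {v: 0.0 for v in hop}
    # Process vertices in reverse topological order (latest activation first).
    for v in sorted(hop, key=hop.get, reverse=True):
        ir[v] += 1.0                              # v counts itself
        total = sum(w for _, w in preds[v])
        for u, w in preds[v]:
            ir[u] += ir[v] * w / total            # share v's relay among its influencers
    return ir

# Hypothetical diffusion: seed s activates a at hop 1; s and a jointly activate b at hop 2.
weight = {("s", "a"): 1.0, ("a", "b"): 0.5, ("s", "b"): 0.5}
hop = {"s": 0, "a": 1, "b": 2}
print(influence_relay(weight, hop))  # {'s': 3.0, 'a': 1.5, 'b': 1.0}
```

In this example the seed's influence relay is 3, which equals the number of activated vertices, as Theorem 4.5 predicts for the total over the seed set.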


Lemma 6. The CIR algorithm produces the influence relay for each activated vertex.

Proof. We use induction to prove that the influence relay of $u_k$ is computed correctly after the
loop $i = k$. Firstly, $u_{n_S}$ is computed first and $IR(u_{n_S}) = 1$. This is correct since $u_{n_S}$ is at the
end of the topological ordering and does not have any activated outgoing neighbors.
Now, suppose that the influence relay of $u_{n_S}, u_{n_S - 1}, \ldots, u_{k+1}$ is computed correctly after
the loop $i = k + 1$; we will prove that $IR(u_k)$ holds the influence relay of $u_k$ after the
loop $i = k$. Let $\{u_{i_1}, u_{i_2}, \ldots, u_{i_p}\}$ be the set of activated outgoing neighbors of $u_k$ which are
activated later than $u_k$. Due to the construction of $IG_S$, $(u_k, u_{i_1}), (u_k, u_{i_2}), \ldots, (u_k, u_{i_p})$ are
edges in $IG_S$, hence $i_q > k$, $1 \le q \le p$, in the topological ordering. After the $i = i_q$ loop,
$IR(u_k)$ will receive a value of
$\frac{w(u_k, u_{i_q})\, IR(u_{i_q})}{\sum_{z \in N^-_{u_{i_q}} \,\wedge\, h(z) < h(u_{i_q})} w(z, u_{i_q})}$
from the influence relay of $u_{i_q}$. At loop
$i = k$, $IR(u_k)$ is increased by 1 for $u_k$ itself and equals the influence relay of $u_k$ according
to Eq. 4-2. □




Time complexity. The topological ordering of a directed acyclic graph can be
computed in linear time, and the number of updates in the main loop equals the
number of edges of $IG_S$, so the CIR algorithm runs in time linear in the size of $IG_S$.

A  crucial  property  of  the  new  metric  is  that  the  total  influence  relay  of  seed  vertices 
reflects  the  influence  of  the  seed  set  as  stated  in  Theorem  4.5. 

Theorem 4.5. The total influence relay of the seed vertices equals the total number of
activated vertices:

$$\sum_{u \in S} IR(u) = |A^d(G, S)|$$

Proof. The proof is based on an invariant of the variables $IR(u_1), \ldots, IR(u_{n_S})$ in the CIR
algorithm. The information is propagated from the seed set, thus the seed vertices
do not have incoming neighbors in $IG_S$ and occupy the smallest indices in the topological
ordering. Let $u_p$ be the highest-index seed vertex. We will prove that after the loop
$i = k + 1$ we have:

$$\sum_{j=1}^{k} IR(u_j) = n_S - k, \quad p \le k \le n_S$$

Before the loop $i = n_S$, this is obviously true. In the loop $i = k$, the value of the variable
$IR(u_k)$ is increased by 1 and redistributed to its incoming neighbors, thus $\sum_{j=1}^{k-1} IR(u_j)$
after the loop equals $\sum_{j=1}^{k} IR(u_j)$ before the loop plus 1. It implies that
$\sum_{j=1}^{k-1} IR(u_j) = n_S - (k - 1)$ after the loop $i = k$.

After the loop $i = p + 1$, we have $\sum_{i=1}^{p} IR(u_i) = n_S - p$. At each loop from $i = p$ down to
$i = 1$, the value of $IR(u_i)$ is increased by 1. Thus, when the algorithm stops we have:

$$\sum_{u \in S} IR(u) = \sum_{i=1}^{p} IR(u_i) = n_S - p + p = |A^d(G, S)|$$

□

Theorem 4.5 implies that each vertex $u \in S$ contributes $IR(u)$ in propagating
the influence over the network $G$. Now, suppose that $G$ is the coupled network of
multiple networks; we can sum up the influence relay of all seed vertices of a component
network to obtain the contribution of that network. Furthermore, the total influence relay
of overlapping vertices indicates the amount of information propagated back and forth
between networks.

We can also adapt the influence relay metric to measure the support between
networks in propagating information. Consider the case where the information emerges
from the seed vertices of one network, propagates to other networks via overlapping
users, and then comes back to the first network. With the support of other networks, the
information is propagated further in the first network. If we consider the diffusion process
in the coupled network and increase the influence relay by 1 only on activated vertices in
the first network, we can quantify the support from other networks by the total influence
relay which goes through other networks.

4.6  Experimental  Evaluation 

In this section, we show the experimental results for the coupling schemes and use
the coupling schemes to analyze the influence diffusion in multiple networks. Firstly,
we compare lossless and lossy coupling schemes to measure the trade-off between
the running time and the quality of solutions. Since the massive influence problem is
NP-hard [17] in a single network, we use the greedy algorithm, which provides
high-quality solutions, to find the solution after coupling networks. We also investigate the
relationship between networks in the information diffusion to answer the following
questions: (1) What is the role of overlapping users in the diffusion of the information?
(2) How does a network benefit from other networks to diffuse the information? (3)
Can the diffusion on one network provide a burst of information in other networks? (4)
What will we miss if we consider each network separately?

We ran all our experiments on an Intel(R) Xeon(R) CPU W350 machine with 12 GB of
RAM and a 2.93 GHz quad-core processor. In all experiments, by default, the number of
hops is $d = 4$ and the influence fraction is $\beta = 0.8$.
Networks    #Nodes    #Edges      Avg. Degree
Twitter     48277     16304712    289.7
FSQ         44992     1664402     35.99
CM          40420     175692      8.69
Het         8360      15751       1.88
NetS        1588      2742        1.73

Table 4-1. Foursquare-Twitter and co-author network datasets

4.6.1  Datasets 

Real networks. We perform experiments on two systems of networks: Twitter and
Foursquare (FSQ) networks [49], and co-author networks in the areas of Condensed
Matter (CM) [43], High-Energy Theory (Het) [43], and Network Science (NetS) [42]. The
statistics of the networks are described in Table 4-1. While the overlapping users of the
first dataset are provided in [49], we match overlapping users of the second one based
on authors' names. The numbers of overlapping nodes of the network pairs FSQ-Twitter,
CM-Het, CM-NetS, and Het-NetS are 4100, 2860, 517, and 90, respectively. Moreover,
while the co-author networks have edge weights, the FSQ-Twitter dataset only contains the
network topologies. If a network does not have edge weights, we assign the weight
of each edge randomly from 0 to 1. We then normalize the edge weights such that the
total weight of incoming edges is 1 for each node. This is suitable since the influence of
user $u$ on user $v$ tends to be small if $v$ is under the influence of many friends. Finally, the
threshold of each node is a random value from 0 to 1.
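The weight and threshold assignment described above can be sketched as follows. The function name, the fixed random seed, and the use of uniform draws from Python's `random` module are assumptions for illustration; the text only says the values are random between 0 and 1.

```python
import random
from collections import defaultdict

def assign_weights_and_thresholds(edges, nodes, seed=0):
    """Random edge weights normalized per destination node, plus random thresholds."""
    rng = random.Random(seed)
    raw = {e: rng.random() for e in edges}         # raw weight in [0, 1)

    in_total = defaultdict(float)
    for (u, v), w in raw.items():
        in_total[v] += w

    # Normalize so that the incoming weights of every node sum to 1.
    weight = {(u, v): w / in_total[v]
              for (u, v), w in raw.items() if in_total[v] > 0}
    theta = {v: rng.random() for v in nodes}       # threshold in [0, 1)
    return weight, theta

edges = [("a", "c"), ("b", "c"), ("a", "b")]
weight, theta = assign_weights_and_thresholds(edges, ["a", "b", "c"])
```

After normalization a node with many in-neighbors receives only a small share of influence from each of them, which is the property the paragraph above motivates.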

Synthesized networks. We also use synthesized networks generated by the Erdős-Rényi
random network model [26] to test networks with controlled parameters. Two
networks with 5000 nodes are formed by randomly connecting each pair of nodes with
probability $p_1 = 0.0008$ and $p_2 = 0.006$, respectively. The average degrees, 8 and 60, reflect the
diversity of network densities in reality. Then, we randomly select an $f$ fraction of nodes
in the first network as overlapping nodes. The edge weights and node thresholds are
assigned as above.
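A minimal generator for such networks is sketched below. It assumes that "each pair" means each ordered pair of distinct nodes, which is an interpretation rather than something the text states; under that reading the expected total (in plus out) degree is $2(n - 1)p$, about 8 and 60 for the two probabilities, which agrees with the reported averages. The function names are hypothetical.

```python
import random

def directed_gnp(n, p, seed=0):
    """Erdos-Renyi style digraph: each ordered pair (u, v), u != v, is an edge with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(n)
            if u != v and rng.random() < p]

def expected_total_degree(n, p):
    """Expected in-degree plus out-degree of a node under the ordered-pair reading."""
    return 2 * (n - 1) * p

print(round(expected_total_degree(5000, 0.0008), 2))  # 8.0
print(round(expected_total_degree(5000, 0.006), 2))   # 59.99

small = directed_gnp(50, 0.1, seed=1)  # small instance; n = 5000 works the same way but is slower
```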

4.6.2  Comparison  of  Coupling  Schemes 

We first evaluate the effect of the coupling schemes on the running time and
the quality of the solutions found when we use the greedy algorithm to solve MIP. As
illustrated in Fig. 4-6, the algorithm produces larger seed sets but runs faster in lossy
coupled networks than in lossless coupled networks. In both the Twitter-FSQ and co-author
datasets, the seed sizes are smallest when the lossless coupling scheme is used. This is
as expected, since the lossless coupling scheme preserves all the influence information,
which is exploited later to solve MIP. However, the seed sizes are only slightly larger with
the lossy coupling schemes. In the lossy coupling schemes, information is only
lost at overlapping users, who make up a small fraction of the total number of users
(roughly 5% in Twitter-FSQ and 7% in co-author networks). Thus, the effect of the lossy
coupling schemes on the solution quality is small, especially when the seed sets are large
enough to influence a large fraction of users. On the other hand, the algorithm runs much
faster in lossy coupled networks, by a factor of up to 2 in Twitter-FSQ and 4 in
co-author networks. The major disadvantages of the lossless coupling scheme are the
doubled number of hops and the extra nodes and edges. In the co-author dataset,
the number of extra edges is relatively high compared to the total number of edges in
all networks, so the speedup factor is higher in co-author networks. We therefore
infer that the lossy coupling schemes work well on real datasets in which networks
are sparse and the number of overlapping users is small.

Next, we examine the effect of the number of overlapping users on the performance of
the coupling schemes with the synthesized datasets. Fig. 4-7 shows the results on two
networks of size 5000 with different fractions of overlapping users f. The overlapping
fraction significantly differentiates the coupling schemes in terms of both solution quality
and running time. When f is small, the seed sizes are quite close for all coupling schemes.
But when f increases, the gap between the schemes widens. The variation of f
also reveals that, among the lossy schemes, the easiness coupling scheme is the best and
the trivial average scheme is the worst. The running time is even more interesting. The
running time in the lossless coupled networks is initially higher than in the lossy coupled
networks, but it gradually catches up and overtakes the latter at f = 0.4. The key point is
the size of the seed set. The larger f is, the more the seed sizes in the lossless and lossy
coupled networks diverge. As the running time depends on the seed size, the running time
in the lossless coupled network decreases faster. Thus, we recommend using the lossless
scheme when the overlapping fraction is large and the seed size is predicted to be small.

Figure 4-6. Comparing coupling schemes for finding the minimum seed set on co-author
networks (upper figures) and on FSQ and Twitter (lower figures)
Figure 4-7. Comparing coupling schemes with different overlapping fraction f


Figure 4-8. Comparing coupling schemes with different number of propagation hops d.
A) Co-author networks. B) FSQ and Twitter.


Since the seed size is sensitive to the number of hops [24], we evaluate the
coupling schemes with different numbers of propagation hops. As in single networks,
Fig. 4-8 shows that the seed size decreases as the number of propagation hops
grows. However, the lossy coupling schemes deviate more and more from the lossless
one in terms of the relative seed size when the number of hops increases. Consider
the ratio of the seed sizes between the best lossy coupling scheme (the easiness one)
and the lossless coupling scheme. It is 1.05 (1.1) and 1.5 (1.3) in co-author networks
(FSQ-Twitter) with d = 2 and d = 5, respectively. The reason is that the lossy coupling
schemes inherently carry an error that is accumulated and propagated after each hop.
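
The dependence of the seed size on the number of hops can be illustrated on a toy example. The sketch below is not the greedy algorithm evaluated in this chapter: it assumes a deterministic d-hop Linear Threshold spread with fixed thresholds and a plain greedy rule that repeatedly adds the node giving the largest spread. All names (`lt_spread`, `greedy_seed_set`, `chain`, `theta`) are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Node = Hashable
# in_edges[v] lists (u, w) pairs: u influences v with weight w.
InEdges = Dict[Node, List[Tuple[Node, float]]]


def lt_spread(in_edges: InEdges, thresholds: Dict[Node, float],
              seeds: Iterable[Node], d: int) -> Set[Node]:
    """Nodes active after at most d hops of Linear Threshold diffusion."""
    active = set(seeds)
    for _ in range(d):
        newly = {
            v for v in thresholds
            if v not in active
            and sum(w for u, w in in_edges.get(v, []) if u in active) >= thresholds[v]
        }
        if not newly:
            break  # diffusion has stopped before using all d hops
        active |= newly
    return active


def greedy_seed_set(in_edges: InEdges, thresholds: Dict[Node, float],
                    beta: float, d: int) -> List[Node]:
    """Add the node with the largest d-hop spread until a beta fraction is active."""
    target = beta * len(thresholds)
    seeds: List[Node] = []
    active: Set[Node] = set()
    while len(active) < target:
        best = max(
            (v for v in thresholds if v not in active),
            key=lambda v: len(lt_spread(in_edges, thresholds, seeds + [v], d)),
        )
        seeds.append(best)
        active = lt_spread(in_edges, thresholds, seeds, d)
    return seeds


# Toy directed chain a -> b -> c -> d -> e with unit weights.
chain: InEdges = {"b": [("a", 1.0)], "c": [("b", 1.0)],
                  "d": [("c", 1.0)], "e": [("d", 1.0)]}
theta = {v: 0.5 for v in "abcde"}
sizes = {d: len(greedy_seed_set(chain, theta, 1.0, d)) for d in (1, 2, 4)}
```

On this chain the greedy rule needs 3, 2, and 1 seeds for d = 1, 2, and 4, which mirrors the trend of smaller seed sets with more propagation hops.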
4.6.3  Benefits  of  Coupled  Network

Coupling schemes provide a mechanism to study multiple networks under a
consistent view, which helps to answer different questions about influence diffusion.
For example, Fig. 4-6 shows a property that is similar to one in a single network: the
seed size increases superlinearly with respect to the influence fraction. This means that the
gain per seed user decreases as the circle of influence is broadened. Moreover,
without the coupled network, we may need to find the seed set on each network to
influence a β fraction of all users and take their union to obtain the seed set for the whole
system. Fig. 4-9 clearly demonstrates that if we influence each network separately,
we need a much larger seed set than what we need in the coupled
network, no matter which type of coupling we use. The seed set found on the lossless
coupled network has almost the same size as the largest seed set found in the component
networks in the co-author dataset, and is even smaller in Twitter-FSQ. In the co-author
dataset, the size of the union set needed to influence a 0.8 fraction of users is 24% and 30%
larger than the sizes of the seed sets found in the lossless and lossy networks, respectively.
These numbers are 23% and 47% in Twitter-FSQ. The reason is that the lossless (lossy)
coupled network can capture (partially capture) the collaboration of networks in propagating
information and exploit it to reduce the seed size. When we find the seed set in each network
separately, we ignore this property. As a consequence, we incur a penalty on the size of the
union set, which is high if the networks propagate information well, as Twitter and FSQ do.
Although we can use other methods to solve MIP without using coupling schemes, they
may be more complicated and cause the seed size to increase.

The coupling schemes not only help to solve MIP; they also help to investigate
other aspects of influence diffusion in the system of networks. Due to
the overlap with other networks, we may underestimate the ability of a specific
network to diffuse information. This motivates us to gauge the viral marketing potential of
a network when information is allowed to propagate back and forth to other networks.
Figure 4-9. The quality of seed sets with and without using the coupled network


Specifically, we use the greedy algorithm to find the smallest seed set to influence a β
fraction of the studied network's users in the lossless coupled network. Then we compare
it to the seed set found from the traditional perspective, which considers this network as a
standalone one. As shown in Fig. 4-10, the seed size decreases by up to 9%, 25%, 17%,
and 26% in CM, Het, FSQ, and Twitter, respectively, when we consider these networks
in connection with other networks. The improvement in NetS is small due to the
small number of overlapping users with other networks. We also observe that the
improvement ratio is higher for the network with the lower conductance of influence in the
case of FSQ and Twitter, two networks with the same number of users. When the network
sizes are unbalanced, Het, the network with fewer users, seems to obtain a better
improvement ratio than the bigger network CM. The back-and-forth propagation of
information between networks is the basis for the outside support of the target network.
When information is propagated from seed nodes in the target network, some nodes
are activated in other networks due to the overlapping nodes. The information then
propagates further and may even come back to the target network, hence the number
of influenced nodes in the target network increases. Fig. 4-11 shows the amount of
influence that other networks relay back to the target network with d = 4 and d = 8
hops. The support is considerable and grows with the number of hops. When
d = 8, the information has more chance to come back to the target network; the support
is up to 2.2%, 5%, 8.3%, and 7.3% on the networks CM, Het, Twitter, and FSQ, respectively.
The support is higher if information propagates more easily in the component networks.

Figure 4-10. The quality of seed sets with and without using the coupled network

4.6.4  Bias  in  Selecting  Seed  Nodes 

Here, we analyze the seed set on the lossless coupled network to observe how
much each network contributes to the composition of the seed set and the set
of influenced nodes. We mainly address two questions: (1) which network supports
the propagation better and (2) whether there is a bias toward a network when selecting
seed nodes. Fig. 4-12 shows the fraction of selected nodes as well as influenced nodes
in each network and in the overlapping part. We can observe that overlapping nodes
tend to be selected in both datasets. When the influenced fraction is 0.4, the fraction
of overlapping seed nodes is around 24.9% and 25% on the co-author and FSQ-Twitter
networks, respectively. Note that only 5% (7%) of all users of FSQ-Twitter (co-author
networks) are overlapping users. This shows that overlapping users not only act
as bridges for information to propagate between networks but also have
high influence. As illustrated in Fig. 4-13, the contribution of the overlapping nodes
in influencing other nodes is high, especially when β is small. Additionally, there is
an imbalance between the number of selected seeds and influenced nodes in each
network. In the co-author dataset, the biggest network, CM, contributes a large number of
seed nodes and influenced nodes. When β = 0.8, 76.7% of seed nodes and 80.5% of
influenced nodes are from CM. In contrast, the number of seed nodes from FSQ is small,
but the number of influenced nodes in FSQ is much higher than in Twitter. At an
influence fraction of 0.4, 27% of seed nodes (excluding overlapping nodes) belong to FSQ
while 70% of influenced nodes are in FSQ. After almost all nodes in FSQ are influenced,
the algorithm starts to select more nodes in Twitter to increase the influence fraction. This
implies an important characteristic of multiple networks: if information flows more easily
in one network, that network will attract and propagate more information inside. In
the big picture, this provides hints for viral marketing: overlapping nodes are high-potential
targets, and some networks are more efficient to advertise on than others.

Figure 4-11. The support between networks on the influence propagation of a network
with d = 4 (upper figures) and d = 8 (lower figures) hops. C, H, N, F, and T
are the abbreviations of CM, Het, NetS, FSQ, and Twitter.

Figure 4-12. The bias in selecting seed nodes on synthesized networks (upper figures)
and on FSQ and Twitter (lower figures)

4.7  Extensions  to  Other  Cascading  Models 
In this section, we show that we can design lossless coupling schemes for some
other well-known cascading models in each component network. As a consequence,
top influential nodes can be identified under these models. In particular, we investigate
the two most popular stochastic diffusion models, the Stochastic Threshold model and
the Independent Cascading model [34]:

•  Stochastic Threshold model. This model is similar to the Linear Threshold model,
but the threshold $\theta^i(u^i)$ of each node $u^i$ of $G^i$ is a random value in the range
$[0, \Theta^i(u^i)]$. Node $u^i$ will be influenced when
$\sum_{v^i \in N^-(u^i),\, v^i \in A} w^i(v^i, u^i) \geq \theta^i(u^i)$.

•  Independent Cascading model. In this model, there are only edge weights
representing the influence between users. Once node $u^i$ of $G^i$ is influenced, it
has a single chance to influence its neighbor $v^i \in N^+(u^i)$ with probability $w^i(u^i, v^i)$.

Figure 4-13. The influence contribution of seed nodes from component networks

For both models, we use the same approach of using user vertices, account
vertices, and the synchronization between user vertices and their account vertices.
Specifically, the weight of edge $(u^i, u^j)$, $0 \leq i \neq j \leq k$, will be $\Theta(u^j)$ for the Stochastic
Threshold model and 1 for the Independent Cascading model. With this assignment, if $u^i$ is
influenced, $u^j$ will be influenced with probability 1 in the next time step. The proof of the
equivalence of the coupling scheme is similar to the ones in Section 4.3.
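
The effect of this weight assignment can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not part of the proof: it assumes that thresholds are drawn uniformly from the stated range and that a node becomes active when the incoming active weight is at least its threshold. The function names and the value of `theta_max` are hypothetical.

```python
import random


def st_activates(incoming_active_weight, theta_max, rng):
    """Stochastic Threshold: the threshold is drawn uniformly from [0, theta_max]."""
    threshold = rng.uniform(0.0, theta_max)
    return incoming_active_weight >= threshold


def ic_activates(edge_weight, rng):
    """Independent Cascading: one activation attempt with probability edge_weight."""
    return rng.random() < edge_weight


rng = random.Random(0)
theta_max = 0.7  # hypothetical upper bound on the threshold of an account vertex

# A synchronization edge whose weight equals the largest possible threshold
# always meets the drawn threshold, so the activation is certain.
sync_st = all(st_activates(theta_max, theta_max, rng) for _ in range(10_000))

# A synchronization edge with weight 1 always succeeds under Independent Cascading.
sync_ic = all(ic_activates(1.0, rng) for _ in range(10_000))
```

Both flags evaluate to true, since the drawn threshold never exceeds `theta_max` and `rng.random()` is always strictly below 1.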

4.8  Summary 

In this chapter, we study the massive influence problem in multiple networks. To
tackle the problem, we introduce novel coupling schemes that reduce the problem to
a version on a single network. We then design a new metric to quantify the flow of
influence inside and between networks based on the coupled network. Extensive
experiments provide new insights into information diffusion in multiple networks.

In the future, we plan to investigate the problem on multiple networks with
heterogeneous diffusion models. In particular, each network may have its own diffusion
model; the question is how to represent them efficiently and whether there exists a
method to couple them into one network.
CHAPTER  5
CONCLUSIONS

In this thesis, we study the problem of identifying granular nodes of the cascading
propagation in networks. Under the cascading effect, these nodes have a strong impact
on the network. It is crucial to detect such nodes to serve various purposes, e.g.,
economic gain. When the studied networks are social networks, we can use these nodes
as targets for advertising. On the other hand, if the networks are infrastructure networks
such as power networks, communication networks, etc., we can protect these nodes from
being attacked. For each kind of network, we propose efficient strategies to find such
groups of nodes.

In interdependent infrastructure networks, we introduce a new centrality for coupled
networks and utilize it to detect the most vulnerable nodes. In addition, an efficient greedy
framework is proposed in which the pure greedy and the centrality measure are combined
to provide a better solution in shorter time. In multiple online social networks, we design
a novel framework to find top influential users. In particular, novel coupling schemes
are designed to reduce the problem on multiple networks to one on a single network. As a
consequence, we can apply existing solutions for a single network to find the most
influential nodes in multiple networks. This is a crucial connection, which shows that
solving the problem on coupled networks is as easy as on a single network. We believe that
the coupling schemes can be extended to other models. Finally, we investigate
cascading failures under the load redistribution model in power networks. A new cascading
centrality is designed specifically for the load redistribution model and can be used to detect
the most critical nodes efficiently. Moreover, we propose the cooperating attack strategy to
evaluate the weakness of networks even when they are designed to tolerate node failures.